Introduction
The shift in the emphasis of medical care from "one-time" encounter care to long-term "integrated managed care" and "evidence-based care" requires that physicians have evidence-based information, and lots of it. This is a shift toward treating the patient across a continuum of time and locations: treating the whole individual and not just the current symptoms.
This is especially important in developing countries, which are generally more responsible for the overall health of their populations. Health care treatment should be implemented that provides the best care over the lifetime of the patient.
The steps to take in research are very clear in the academic world. A randomized controlled trial is the "golden rule" (more commonly called the "gold standard") for establishing the effectiveness of an intervention. A researcher develops a hypothesis, develops indicators to measure it, develops a method to test it, collects data, and finally analyzes the results to either support or refute the hypothesis. Unlike the golden rule, Data Mining must rely on statistical methods to eliminate bias. Techniques for doing so are well established, and much of the best current health services research relies on these techniques.

Data Warehouse Database vs Data Mining Database
The concept that ties together Data Warehousing and Data Mining is that of a "large database." The Data Warehouse provides data that has been organized in a consistent and logical way, and Data Mining provides the intelligence to process that data. Data Mining requires large amounts of data in order to build and train the models that will then be used to perform classification, prediction, estimation, or other data mining tasks (Berry and Linoff, 1997).

Data Mining Process
In Data Mining, you put in all the data, churn the pot, and see what comes out. What works, works, and it makes no difference whether it makes sense (the hypothesis) or not.
The moral of this story, if you would call it that, is that there are many relationships in life that just don't make any sense, yet they happen. The problem with the step-by-step hypothesis method is that first you have to think up the relationship, then you have to take time to test it, and it costs money... and then it may not work.

The Golden Rule:
Requires a significant amount of time to return results. By the time the results are published, the data may no longer be useful.
It is static. It does not allow for the stages of development such as initial introduction, development, maturity, and decline. Different methods of management should be used in different phases of an organization's growth.
Requires assumptions at the beginning that may not be valid by the end of the test period.
Expensive to do in isolation from other research.
Requires that data collection, work processes, and training be set up as "special events."
Tests only the indicators directly attributable to the program, thereby losing much information that may be attributable to other factors such as economic events or social trends. Since the goal of randomized experiments is to isolate the effects of the interventions, no other factors that may have influenced the results are examined. Therefore the supposed advantage of "isolating the effects" may be detrimental. [However, other tests can determine information excluded by the Golden Rule.]
Results are limited in scope to the region of study undertaken by the research project. In many cases research projects are undertaken at the national level and generalized. This generalization may be an "average" of all the subpopulations and have no relevance to any one particular subpopulation.
Unstable environment for the staff and the local community. They know that the work may last only as long as the project.
Impact measures require a time series analysis; a good time series analysis needs at least 50 measurements.
Design control requires that procedures are not changed. This leaves no room for the initiatives that good management would call for.
A project may be significant but not practical.

Data Mining
Early detection of significant events. Data Mining helps organizations notice client needs, remember their preferences, and learn from past experiences in order to provide better services and health care. It allows an organization to notice patterns. A pattern is a series of events or characteristics that happen regularly enough to be predictable. Once a pattern has been identified, we devise rules to explain and predict it. For example, it may be noticed that once one family member is brought into the clinic for treatment of certain problems, another member of the family is brought in very soon after. Once the correlation between the two events has been established, we can use these rules to create new ideas for preventing the second family member from becoming sick.
Reassesses information continuously. There is no need to wait "once a year" to produce a static report.
Does not require withholding benefits from the "control group." This increases the acceptability of the introduced intervention among political interest groups.
By including all members of the community in the results, specific target groups can be identified more easily.
By integrating all data, outcome measures and impact measures can be obtained on a regular basis with no additional work.
With sufficient data, meaningful results can be determined at the lowest level of detail. In this case, there is no worry about contamination from surrounding areas, since these groups will also be included in the results.
By integrating data into the central database system and using traditional tools for statistical analysis, clinical trials can be dynamically planned, thereby reducing development time and cost.
Postmarketing surveillance examines the beneficial and adverse effects of drugs and equipment from the time they are put on the market. It consists of two stages: a hypothesis generation stage, in which an effect or side effect is suspected, and a hypothesis verification stage, in which the hypothesis is tested. Hypothesis generation is typically based on the spontaneous reporting of potential side effects by physicians. Studies have indicated that this spontaneous reporting leaves much to be desired. To test a hypothesis, very large populations need to be monitored (Institute of Medicine, 1997). A centralized database does this automatically and without any added cost.
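The family-visit pattern described under early detection can be checked directly against visit records. The following is a minimal sketch, not a definitive method: the record layout (family ID, client ID, visit date), the 14-day window, and the sample visits are all assumptions made for illustration. It treats every visit as a potential "first" visit, which a real analysis would probably refine.

```python
from collections import defaultdict
from datetime import date, timedelta


def follow_up_rate(visits, window_days=14):
    """Share of visits followed, within window_days, by a visit from a
    different member of the same family.

    visits: iterable of (family_id, client_id, visit_date) tuples
            (a hypothetical record layout).
    """
    by_family = defaultdict(list)
    for family_id, client_id, visit_date in visits:
        by_family[family_id].append((visit_date, client_id))

    window = timedelta(days=window_days)
    total = 0
    followed = 0
    for family_visits in by_family.values():
        family_visits.sort()  # chronological order within the family
        for i, (day, client) in enumerate(family_visits):
            total += 1
            # Look ahead for another family member inside the window.
            if any(other != client and later - day <= window
                   for later, other in family_visits[i + 1:]):
                followed += 1
    return followed / total if total else 0.0


# Illustrative data only.
sample_visits = [
    ("F1", "a", date(2024, 1, 1)),
    ("F1", "b", date(2024, 1, 5)),   # second member, 4 days later
    ("F2", "c", date(2024, 1, 2)),
    ("F2", "c", date(2024, 1, 20)),  # same client returning, not counted
]
rate = follow_up_rate(sample_visits)
```

A rate that is high compared with the clinic's baseline would suggest the pattern is worth turning into a rule, for example by offering preventive advice to the rest of the family at the first visit.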

Data Mining Examples

Associations
These techniques identify occurrences that tend to appear together among members of a collection. For example, "80% of all clients who leave the program have also had domestic problems."

Clustering
The segmentation of a population into subsets, or clusters, based on a set of attributes. For example, the known population can be segmented by attributes such as income, education, or religion. This information can then be applied to understand client preferences.

Classification
Used to classify clients into a number of predefined classes based on predetermined criteria. For example, clients may be classified as either very likely to leave the program or not very likely to leave the program.

Sequencing
Client patterns are identified over a period of time. For example, if the time between a client's purchases of contraceptives has increased since her last visit, then that client is likely to leave the program.

Variation Analysis
A variation analysis looks for variations in a set of data and attempts to isolate any one factor that might influence a given measurement. Finding variations in data is important when you are trying to determine the cause of a particular problem. A variation analysis might provide the following finding: "People living in the new government housing project have higher rates of respiratory illnesses than others."

Comparison Analysis
Using the preceding example, comparison analysis could be used to compare the number of cases of a certain type of illness in two different locations, to help decide which one may need greater assistance or an increase in public health resources.

Cause and Effect Analysis
A cause and effect analysis determines the effects of a given event. For instance, it is no surprise that a 30% increase in the number of clients in the waiting room severely reduces the quality of care given by the physician (who spends less time with each client, is distracted, etc.), but to what extent may be unknown.
It is just as likely that this increase in the number of clients causes other, less obvious problems. A cause and effect analysis would reveal these effects.

Trend Analysis
A trend analysis looks for changes in the value of a measure over specific periods of time. For example, is the increase in the number of clients in the clinic a reflection of "good business," or does it reflect an "epidemic" in the target location?

Deviation Analysis
Deviation analysis is one outcome of trend analysis. It identifies data that fall outside the norm of expected values. The key to identifying unusual data is to have an established baseline for the target population.

Predictions
Predictions about the future can be made by allowing the data-mining engine to compare a set of inputs to the existing patterns in the database (the prediction model). Unlike descriptive discovery, which is designed to find patterns and help understand data, predictive modeling uses the patterns to make a best guess at the values for new data sets. Although it is important to give every client the "correct" quantity of care, in order to speed up the servicing of clients it may be necessary to establish a "risk factor" that determines the amount of time the physician should spend with the client. All clients are allocated risk points, and if the sum of a client's risk points meets or exceeds a predetermined threshold, the physician will spend extra time with that client. A data-mining model that emulates this point system can be created so that, when a client is compared with the performance of past and current clients, the physician is helped in making the decision.
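The risk-point rule itself is simple enough to write down. The sketch below shows only the scoring and threshold check; the factor names, point values, and threshold are hypothetical and chosen by hand for illustration. In the approach the text describes, a data-mining model would derive them from past and current client records.

```python
# Hypothetical risk factors and point values (assumptions, not from the source).
RISK_POINTS = {
    "chronic_condition": 3,
    "missed_last_visit": 2,
    "age_over_65": 2,
}
THRESHOLD = 4  # hypothetical cut-off


def risk_score(client, points=RISK_POINTS):
    """Sum the points for every risk factor flagged on the client record."""
    return sum(value for factor, value in points.items() if client.get(factor))


def needs_extra_time(client, points=RISK_POINTS, threshold=THRESHOLD):
    """True when the client's total meets or exceeds the threshold."""
    return risk_score(client, points) >= threshold


# Illustrative client records.
client_a = {"chronic_condition": True, "age_over_65": True}    # 3 + 2 = 5
client_b = {"missed_last_visit": True}                         # 2
client_c = {"missed_last_visit": True, "age_over_65": True}    # 2 + 2 = 4

flags = [needs_extra_time(c) for c in (client_a, client_b, client_c)]
```

Note that `client_c` sits exactly on the threshold and is still flagged, matching the "meet or exceed" wording of the rule.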

Descriptive Information
Descriptive information draws conclusions about past events. Its purpose is to help understand the cause and effect relationships between elements of data. There are six ways to use data mining to gain knowledge about past events.

Influence Analysis
An influence analysis determines how factors and variables affect an important measure such as illness. For instance, a physician might want to know what factors contribute to the likelihood of the client complying with instructions. There are many variables that can affect the success; examples are: