Researching Big Data Business Case Study Essay
Briefly define the term Big Data, what it is and how it is used.
Choose one of the following research topics related to Big Data.
Find articles that discuss the use of big data in your career field or a field in which you are interested. What benefits are being delivered? What are some of the hardware and software tools being employed? What complications are limiting the results of big data? Write a paragraph or two answering each of these questions.
Many executives still mistrust the results of big data and analytics. Find articles that address this issue and summarize the key reasons for that mistrust. How might these issues be overcome?
IBM Watson is a question-answering computer system capable of answering questions posed in natural language. Find articles that appeal to you and develop a one-page report that discusses either how Watson is “trained” or an application of Watson of interest to you.
At least one article should come from a GU Library resource (find the library link at the top of the Glife page). Include the article link in the paper. Remember, information should be restated in your own words. Any direct quotes should be properly credited to the source, and quotes should not comprise more than 20% of the paper.
Each minute a person in the United States dies from cancer—over half a million deaths per year. Thousands of scientists and physicians are working around the clock to fight cancer where it starts—in our DNA.
DNA is a molecule present in our cells that carries most of the genetic instructions used in the development, functioning, and reproduction of all known living organisms.
The information in DNA is stored as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Human DNA consists of about 3 billion bases, and more than 99 percent of those bases are the same in all people. The complete set of DNA instructions is called your genome, and it comes packaged into two sets of chromosomes, one set from your mother and one set from your father. Sometimes those instructions are miscoded or misread, which can cause cells to malfunction and grow out of control, resulting in cancer.
Doctors now routinely use patient genetic data along with personal data and health factors to design highly personalized treatments for cancer patients. However, genome sequencing is a highly complex effort: it takes about 100 gigabytes of data to represent just a single human genome. Only a few years ago, it was not even feasible to analyze an entire human genome. The Human Genome Project (HGP) was an international, collaborative research program whose goal was the complete mapping and understanding of all the genes of human beings. The HGP took over 15 years and cost in the neighborhood of $3 billion, but the result was the ability to read the complete genetic blueprint for humans.
It takes a computer with substantial processing power and prodigious amounts of storage capacity to process all the data required to sequence a patient's genome. Most researchers simply do not have in-house computing facilities equal to the challenge. As a result, they turn to cloud computing solutions, such as the Amazon Web Services public cloud system. Thanks to cloud computing and other technical advances, sequencing of a human genome can now be done in about 40 hours at a cost of under $5,000.
Researchers at Nationwide Children’s Hospital in Columbus, Ohio invented Churchill, a software application that analyzes gene sequences very efficiently. Using cloud computing and this new algorithm, researchers at the hospital are now able to analyze a thousand individual genomes over a period of a week. Not only does this technology enable the hospital to help individual patients, it also helps large-scale research efforts exploring the genetic mutations that cause diseases.
Using the cloud also enables doctors and researchers worldwide to share information and collaborate more easily. The Cancer Genome Atlas (TCGA) is a research program supported by the National Cancer Institute and the National Human Genome Research Institute, whose goal is to identify genomic changes in more than 20 different types of human cancer. TCGA researchers compare the DNA samples of normal tissue with cancer tissue taken from the same patient to identify changes specific to that cancer. The researchers hope to analyze hundreds of samples for each type of cancer from many different patients to better understand what makes one cancer different from another cancer. This is critical because two patients with the same type of cancer can experience very different outcomes and respond very differently to the same treatment. Researchers hope to develop more effective, individualized treatments for each patient by connecting specific genomic changes with specific outcomes.
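To get a feel for the scale of data involved, the figures in the case study can be combined into a rough storage estimate. The sketch below is only a back-of-the-envelope illustration: the 100 gigabytes per genome comes from the case study, while the patient count, the two samples per patient (one normal, one tumor), the decimal units, and the function names are assumptions chosen for this example.

```python
# Rough storage estimate for a multi-cancer genome study.
# GB_PER_GENOME is the figure quoted in the case study; every other
# input below is an illustrative assumption, not a TCGA statistic.
GB_PER_GENOME = 100


def storage_estimate_gb(patients_per_cancer, cancer_types,
                        samples_per_patient=1,
                        gb_per_genome=GB_PER_GENOME):
    """Return the total raw genome data, in gigabytes."""
    return (patients_per_cancer * cancer_types
            * samples_per_patient * gb_per_genome)


def format_size(gigabytes):
    """Format a size using decimal units (1 TB = 1,000 GB)."""
    if gigabytes >= 1_000_000:
        return f"{gigabytes / 1_000_000:.1f} PB"
    if gigabytes >= 1_000:
        return f"{gigabytes / 1_000:.1f} TB"
    return f"{gigabytes:.1f} GB"


# Hypothetical study: 300 patients for each of 20 cancer types,
# with a normal and a tumor sample sequenced for every patient.
total_gb = storage_estimate_gb(300, 20, samples_per_patient=2)
print(format_size(total_gb))  # prints "1.2 PB"
```

Changing the inputs shows how quickly storage needs grow with the number of patients and samples, which is one reason researchers turn to cloud storage rather than in-house facilities.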
Critical Thinking Questions
What advantages does cloud computing offer physicians and researchers in their fight against cancer?
Estimate the amount of data required to analyze the human genome of 100 patients for each of 20 different types of cancer.
Physicians must abide by HIPAA regulations when transmitting data back and forth to the cloud. The penalties for noncompliance are based on the level of negligence and can range from $100 to $50,000 per violation (or per record). Violations can also carry criminal charges, resulting in jail time. What measures can be taken when using cloud computing to ensure that patient confidentiality will not be violated?
Submit the assignment to Dropbox.
Remember, information should be restated in your own words. Any direct quotes should be properly credited to the source and quotes should not comprise more than 20% of the paper.
Choose one of the following research topics related to Enterprise Systems or eCommerce.
Locate an article on eCommerce systems and how they work. Describe how they might be implemented at either a large or small scale and the types of information systems that would need to be in place to support them.