
Cloud Computing And Big Data

By

Student Name

University

Abstract

Big Data refers to a methodology for data analysis that builds on recent advances in technology and architecture. Big Data technology offers a variety of benefits in terms of processing and hardware resources, at costs that small and medium-sized businesses find manageable and easy to adopt. It is cloud computing that has made Big Data technology practical for these organizations. The core functions of Big Data rest on a technology named MapReduce, a specific programming algorithm. MapReduce works with storage devices attached to networks, and these storages are accessed by means of parallel processing. The MapReduce programming model provides ample functionality, which may go beyond the requirements of small and medium-sized organizations. Cloud computing refers to on-demand network access to computing resources that are normally provided by an outside entity. Several service models are used for the deployment of cloud computing, comprising Platform as a Service (PaaS), Hardware as a Service (HaaS), Software as a Service (SaaS), and Infrastructure as a Service (IaaS). Furthermore, three types of cloud exist: the public cloud, the private cloud, and the hybrid cloud. A private cloud is the data center of a business; it remains available for internal access only, and the general public cannot reach such networks. A public cloud offers services that are accessible to the general public on a pay-as-you-go basis. A hybrid cloud is a combination of a private and a public cloud. Small and medium-sized companies deploy cloud computing for several reasons, the three major ones being cost reduction in processing, cost reduction in hardware, and the ability to test the value of Big Data. There are, however, several drawbacks associated with cloud computing, and the major fears are loss of control and security.

Introduction:

Big Data refers to a methodology for analysis that is based on modern architectures and technologies to support the speedy capture, storage, and analysis of data. Data sources are not limited to traditional databases; the approach also supports mobile devices, social media, sensor-generated data, and email. The data held in such systems is stored not only in structured form but also in unstructured form. Availability of large storage space is a basic requirement of Big Data. Although storage costs are decreasing day by day, they can still pose a serious financial concern for small and medium-sized enterprises. The typical storage and analysis infrastructure is expected to be based on network-attached storage (NAS) arranged into clusters. To configure a clustered NAS infrastructure, numerous NAS pods are each integrated with several storage devices, and the NAS devices are in turn connected with each other to enable fast sharing and searching of data. Storing data by means of cloud computing is apprehended as a feasible choice for small and medium-sized companies that plan to deploy Big Data analytic techniques. Cloud computing is understood as on-demand network access to computing resources that are normally provided by outside entities, although they can also be managed internally by the business. Several deployment models and architectures exist for cloud computing, and these can also be combined with other design methods and technologies. Small and medium-sized organizations that intend to use cloud computing to meet their data needs but cannot afford clustered NAS technology can deploy other computing models for Big Data solutions. For this purpose, it is essential for small and medium-sized enterprises to select the appropriate cloud computing model in order to remain both profitable and competitive in the market. (Kling, 2014)
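
To make the clustered NAS idea above concrete, the following minimal Python sketch distributes data blocks across a handful of hypothetical NAS nodes using a stable hash. The node names and the placement scheme are illustrative assumptions, not a description of any particular NAS product, which would use its own placement and replication logic.

```python
# Minimal sketch: hash-based placement of data blocks across a
# clustered NAS. Node names are hypothetical.
import hashlib

NAS_NODES = ["nas-pod-1", "nas-pod-2", "nas-pod-3", "nas-pod-4"]

def place_block(block_id: str) -> str:
    """Map a data block to one NAS node via a stable hash."""
    digest = hashlib.sha256(block_id.encode("utf-8")).hexdigest()
    return NAS_NODES[int(digest, 16) % len(NAS_NODES)]

if __name__ == "__main__":
    for block in ["patient-records-001", "sensor-feed-017", "email-archive-9"]:
        print(block, "->", place_block(block))
```

Because the hash is stable, every node can compute the same placement independently, which is what allows the pods in a cluster to be read in parallel without a central lookup table.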

Currently, big data is also apprehended as one of the most advanced approaches in business. Organizations perceive in it underlying opportunities, particularly in the clinical field, for bringing previously undetermined data into view so that it can be transformed in a manner that yields valuable insights. Furthermore, it allows companies to recognize their useless data and to determine their strengths and weaknesses in depth. Big data technology and many other cloud-based technologies have not only aided small and medium-sized businesses but have also brought robust improvement to the clinical arena and the healthcare field. Big data technology will also open up fresh methods for measuring improvement in quality of care and in patient outcomes. For medicine and healthcare, cloud computing has a bright future due to its ubiquitous nature. (Sosinsky, 2011)

Body:

The term big data refers to data that is unstructured; owing to its nature, such datasets cannot be stored in traditional database systems, which also makes them complicated to analyze in such circumstances. It can clearly be seen that the traditional methods based on paper, analog media, and film are giving way to paperless and filmless approaches in healthcare and in small to medium businesses, driven by the tsunami of data held across the various databases and systems of a hospital or a small business. (Rountree, Castrillo, 2014) As the number of servers increases, this growing pool of servers is also referred to as the cloud. Digital content has been observed to grow at an exponential rate: data is growing by 5.4 petabytes, and the volume doubles every 18 months. This increase in data has been driven by social media websites, biological data, medical data, the World Wide Web, and genomic sequences. (Jaatun, Rong, 2009) In small and medium-sized business organizations, many devices, such as fitness trackers, sensors, and medical devices, are also found to be interconnected, and the data produced by this network of connected devices is referred to as the Internet of Things (IoT). From that perspective, big data may be regarded as big science. (Chorafas, 2011)
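
As a worked illustration of the growth figures cited above, the short Python snippet below projects a data volume that starts at 5.4 petabytes and doubles every 18 months. The starting figure comes from the text; the projection horizons are illustrative assumptions.

```python
# Worked arithmetic: a data volume that doubles every 18 months.
start_pb = 5.4          # petabytes today (figure cited in the text)
doubling_months = 18

for years in (1, 3, 5):
    months = years * 12
    projected = start_pb * 2 ** (months / doubling_months)
    print(f"after {years} year(s): {projected:,.1f} PB")
```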

Big data is considered data beyond traditional database limits, in that no ordinary software is able to quantify, manage, capture, and store the data for analysis. With the passage of time, various technologies have emerged to handle massive data collection by placing the data into data stores. The term "Big Data" thus refers to data of such size that it surpasses the ability of traditional databases to store it and of traditional software to manage, capture, and analyze it. The term is associated not only with the manipulation and analysis of data but also with the emerging new technologies being developed to handle massive data collections. Big Data is further described as an evolving approach to quantifying large data, which can range from terabytes to petabytes in a single dataset, and it is assumed that as technology advances, the span of big data will increase as well. The amounts involved vary across industries and can also reach exabytes. One exabyte is equivalent to one quintillion bytes of data, which can also be counted as one billion gigabytes or one thousand petabytes. It has been estimated that mankind has created up to 2 exabytes of data. (Molen, 2010)
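
The unit equivalences stated above can be checked with a few lines of Python, assuming decimal (SI-style) units rather than binary ones:

```python
# Checking: 1 exabyte = 10**18 bytes
#         = one billion gigabytes = one thousand petabytes.
BYTES_PER_GB = 10 ** 9
BYTES_PER_PB = 10 ** 15
BYTES_PER_EB = 10 ** 18

print(BYTES_PER_EB // BYTES_PER_GB)   # 1000000000 gigabytes
print(BYTES_PER_EB // BYTES_PER_PB)   # 1000 petabytes
```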

Several kinds of challenges can be imagined with big data, such as processing and managing the data within a tolerable timeframe, figuring out the underlying issue, and estimating the time required for its resolution. In the current technological era, technologies for processing and analyzing massive data stores are readily available, and big data technology provides the opportunity to reap cost-effective data management and analysis for meaningful insights with flexibility and liberty. What makes big data more interesting is that its definition keeps getting bigger: the definition originally comprised three "Vs," and at least two more have since been proposed. According to Gartner's definition, big data initially centered on the first three Vs, that is, high volume, high velocity, and high variety of information, which demand innovative, cost-effective forms of information processing for enhanced insight and decision making.

It is evident that data volume has increased astronomically. Data velocity, however, offers an interesting insight of its own. (Therplan, 2011) Data arrives in greater numbers, at higher speeds, and with higher transaction frequency, and it is mandatory to understand big data velocity, along with synchronization and prioritization, in order to meet the strategic demands of small and medium-sized business operations. The variety aspect can be thought of as the spice of data: industries exhibit an amazing array of data types, and different insights are drawn from geographical locations, text, and many other kinds of datasets. Data is now considered a rich organizational asset, a natural resource that must be captured. Data analysts have therefore been adding further "V" aspects to the definition of big data. One such added aspect is the verification of big data: it has been argued that because data comes from many sources of varying quality, security becomes another big concern for data integrity. During the extract, transform, and load (ETL) process, managing data in warehouses from the perspective of quality verification and conformity becomes an undeniable feature of big data. (Leymann, 2011)
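
The ETL verification idea described above can be sketched in a few lines of Python. The record layout, the validation rule, and the in-memory "warehouse" are hypothetical simplifications for illustration, not a specific warehouse schema or tool.

```python
# Minimal ETL sketch with a verification step: bad records are
# quarantined instead of being loaded into the warehouse.
import csv
import io

RAW = "id,amount\n1,100\n2,notanumber\n3,250\n"

def extract(text):
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    clean, rejected = [], []
    for row in rows:
        try:
            row["amount"] = float(row["amount"])  # verify and conform the type
            clean.append(row)
        except ValueError:
            rejected.append(row)                  # quarantine the bad record
    return clean, rejected

warehouse = []
clean, rejected = transform(extract(RAW))
warehouse.extend(clean)                           # load phase
print(len(warehouse), "loaded;", len(rejected), "rejected")
```

Even in this toy form, the transform phase is where verification and conformity live, which is why ETL quality checks are treated above as an inherent feature of big data management.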

Value is referred to as yet another dimension and is added as another V. Deriving value from big data analytics can have a profound impact on processes and outcomes, and it is considered crucial for determining critical objectives in the context of big data. In small and medium-sized businesses and in healthcare, the value aspect is apprehended as a remarkable source of innovation, growth, progress, and new models of improvement derived from valuable big data. At the same time, the proliferation of big data brings serious challenges into the system in many ways. In the healthcare industry, information is often stored in systems that do not interact or communicate with one another; in the current era, organizations are data rich but information poor. It has been observed that clinicians must navigate around the system to detect data and put the pieces of information together. The healthcare industry has gained pace in data collection, but the challenges remain the same.

In healthcare and in other small and medium-sized businesses, data can be categorized as unstructured data, structured data, streaming data, and imaging data. The emergence of new technologies has enabled search, indexing, and navigation of data across these diverse sources. Small and medium-sized organizations and healthcare providers hold massive amounts of data in repositories and databases that are further fed into a data warehouse. With the additional volumes generated by genomic sequencing and by equipment and devices, it is easy to see that "connecting the dots" is becoming complex. Running deep analytical queries over structured and unstructured volumes is a major problem; to cope with it, massively parallel processing of data is required, and appliances have been proposed for deep analytics together with natural language processing (NLP) capabilities.

This approach may seem sufficient, but big data is not only data at rest: a significant amount of it is in motion. Streaming data is a distinct concept within big data, because the analysis has to be quick while the data is moving. Considerable progress has been observed in this scenario; in ICUs, correlating data elements such as waveforms over hours in a live environment is an exciting example. Traditional and big data methods are merging to handle these data elements: where the traditional approach is structured, with repeatable analysis, the big data approach is iterative in nature and is used for exploratory analysis. (May, 2011)
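
To illustrate analysis of data in motion, the sketch below computes a sliding-window average over a simulated vital-sign stream and raises an alert when a threshold is crossed. The readings, window size, and threshold are invented for illustration and do not represent any clinical system.

```python
# Sketch of streaming analysis: a sliding-window average with an
# alert threshold, evaluated as each reading arrives.
from collections import deque

WINDOW, THRESHOLD = 5, 120.0
readings = [98, 101, 99, 135, 142, 138, 120, 97, 95]  # simulated stream

window = deque(maxlen=WINDOW)
for t, value in enumerate(readings):
    window.append(value)                 # oldest reading drops out automatically
    avg = sum(window) / len(window)
    if avg > THRESHOLD:
        print(f"t={t}: windowed average {avg:.1f} exceeds {THRESHOLD}")
```

The point of the pattern is that each reading is processed once, as it arrives, rather than being stored first and queried later, which is the defining difference between data in motion and data at rest.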

Big data provides a fluid stage for discovery and creativity. The user is put at ease to find dimensions and facets through intelligent data insights; it becomes possible to scan massive data stores, connect them with other data types, and thereby find new underlying meanings and insights. Correlating cost with data and performance outcomes, and then correlating those with evidence-based guidelines and best business practices, can reveal opportunities and insights that keep pushing the needle toward new models. A massive amount of computation is required for big data analysis, and it has now become accessible through the provision of cloud technologies. For example, Watson at IBM once required massive and expensive services to work through large blobs of data; the scenario has completely changed, and such services are now rendered through readily available cloud technologies, so that deploying Watson as a cloud-based service seems a realistic offer. The same holds true for various other cloud technology platforms such as Amazon Web Services and Hadoop, and this trend is expected to continue to grow. Several benefits associated with cloud-based services are igniting the trend, such as elasticity, self-service, on-demand provisioning, high performance, and cost-effectiveness. In many ways, the essence of data management has shifted with the ready availability of big data capabilities and tools. A new debate starts here, however: is it feasible for small and medium-sized business organizations and healthcare departments to afford storing the information, or should they put it away? It was once believed that massive amounts of data could not be processed and managed to yield insightful metrics and quick responses. The focus has now also shifted toward the security, integrity, and quality of information in terms of information lifecycle management. (Murugesan, 2010)
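
As a minimal illustration of the MapReduce pattern mentioned in the abstract, the sketch below counts words in a single Python process. Frameworks such as Hadoop apply the same map and reduce phases in parallel across networked storage nodes, which this toy version does not attempt.

```python
# Minimal single-process sketch of the MapReduce pattern (word count).
from collections import defaultdict

def map_phase(documents):
    # Map: emit (word, 1) for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1

def reduce_phase(pairs):
    # Reduce: sum the counts per word; grouping by key happens here,
    # where a distributed framework would shuffle across nodes instead.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big data on the cloud", "the cloud stores big data"]
print(reduce_phase(map_phase(docs)))
```

Because the map step is independent per document and the reduce step is independent per word, both parallelize naturally, which is what lets platforms like Hadoop spread the work over the clustered storage described earlier.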