Big Data Analytics

May 3, 2022 by B3ln4iNmum
Student ID Numbers: KIJ19472890, PAV20490894, COR20490899.

Academic year and term  

2021/22 – Semester – 1– Year 2

Module Title Big Data Analytics

 

Module Code  QAC020N255S

 

Module Convenor  Masum Billah
Academic Declaration:

Students are reminded that the electronic copy of their essay may be checked at any point during their degree, with Turnitin or other plagiarism detection software, for plagiarized material.

Word Count   Date Submitted   6/04/2022
       

 

KIJ19472890_PAV20490894_COR20490899_QAC20N255S

Big_Data_Analytics_Assignment_1


Table of Contents

 

1: Introduction

2: Necessary definitions and concepts

3: Assignment task

4: References and Bibliography


1. Introduction

 

 

Big Data Analytics involves working with huge amounts of data, obtained from many different sources and collected for processing, using data management techniques and modern machine learning. This data can come from transaction records, sensors, videos, images, social networks and many other sources.[1]

The analysis aims to discover regularities and patterns that may be invisible because of the sheer amount of data and that, after appropriate processing, become a valuable source of information used in many fields. In business, for example, it helps a company gain an advantage over the competition by avoiding mistakes, correcting the strategies or tactics in use, and predicting risks and actions in the near future.


2. Necessary definitions and concepts

 

 

  1. Data Lakes.

A Data Lake is defined as a storage repository that collects huge amounts of structured, unstructured and semi-structured data until it is needed for analytics applications. It uses a flat architecture for storing data, unlike traditional data warehouses, which are based on hierarchical dimensions and tables. Data is stored in its original format with no restrictions on size or source.

A specific lifecycle describes the individual stages: data acquisition, ingestion through frameworks, storage in a repository, and the training of machine learning models for advanced analytics. The next step is to build dashboards that help to make decisions on new projects. Another step is to build an application that supports the analysis.
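The idea of keeping data in its original format and interpreting it only when it is read can be sketched in a few lines of Python. This is a minimal illustration, not how any commercial product works: the flat directory stands in for the lake, and the function names `ingest` and `read_on_demand`, the file names and the sample records are all hypothetical.

```python
import csv
import json
import tempfile
from pathlib import Path


def ingest(lake: Path, name: str, payload: str) -> Path:
    """Store a raw payload unchanged in the flat lake directory."""
    target = lake / name
    target.write_text(payload, encoding="utf-8")
    return target


def read_on_demand(path: Path) -> list:
    """Interpret a stored file only at read time (schema-on-read)."""
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".json":
        return json.loads(text)  # semi-structured data
    if path.suffix == ".csv":
        return list(csv.DictReader(text.splitlines()))  # structured data
    return [{"text": text}]  # unstructured data kept as free text


def demo() -> dict:
    """Ingest three hypothetical sources, then read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        ingest(lake, "sensor.json", '[{"sensor": "temp", "value": 21.5}]')
        ingest(lake, "tests.csv", "part,strength\nbracket,410\n")
        ingest(lake, "notes.txt", "Prototype ran in cold conditions.")
        return {p.name: read_on_demand(p) for p in sorted(lake.iterdir())}
```

Calling `demo()` returns one list of records per file. The CSV values come back as strings because no schema was imposed when the data was stored.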

 

Data Lakes in the Automotive Industry: data lakes are an ideal tool for the automotive industry because of the variety of data involved, such as the chemical structure of materials, their strength, the effectiveness of fuel mixtures and the ecological drives used. Experts working in research laboratories deliver the results of their experiments to engineers working in the design department. Engineers test prototypes and put them to work in unfavorable conditions. Decisions are then made regarding the use of individual components.

Leading Data Lake products available on the market:

  • Azure Data Lake Storage.
  • Infor Data Lake.
  • AWS Lake Formation.
  • Intelligent Data Lake.


  2. In-memory Databases.

 

In-memory Databases are purpose-built databases that rely mainly on memory for data storage. Unlike databases that store data on disk, in-memory databases are faster because data access time is reduced to a minimum, thanks to the lack of any need to access the disk.

An in-memory database does not use traditional disk drives for data storage; the computer's RAM (main memory) is used instead. Data is loaded in a compressed, non-relational format.
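A small sketch can make the idea concrete. It uses Python's standard `sqlite3` module with the special `:memory:` database name, which keeps the whole database in RAM. SQLite is relational and is used here only as a convenient stand-in, not as one of the products discussed in this report. The table, the function names and the sample readings are hypothetical.

```python
import sqlite3


def build_in_memory_db() -> sqlite3.Connection:
    """Create a database that lives entirely in RAM and load sample rows."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?)",
        [("speed", 52.0), ("speed", 58.0), ("temp", 21.5)],  # hypothetical data
    )
    conn.commit()
    return conn


def average_for(conn: sqlite3.Connection, sensor: str):
    """Return the average value for one sensor, or None if it has no rows."""
    row = conn.execute(
        "SELECT AVG(value) FROM readings WHERE sensor = ?", (sensor,)
    ).fetchone()
    return row[0]
```

The data disappears when the connection is closed, which is the trade-off of keeping everything in main memory without a persistence layer.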

Best options available on the market:

Both Redis and Cassandra are NoSQL databases.

Redis is an in-memory data store that is used as a database and supports different types of data.

Cassandra is an open-source NoSQL database management system that offers linear scalability and high performance, although it persists data to disk rather than holding it primarily in memory.

In-memory Databases in the Automotive Industry: the future of the automotive market lies in intelligent cars. Each journey will generate huge amounts of data as the car interacts with other road users' cars or with devices controlling traffic. Quick access to data is fundamental, as the response to external threats must come within a few milliseconds. Throughput and latency troubleshooting are the biggest challenges facing car designers and in-memory databases.

 

 

  3. Streaming Analytics.

Streaming Analytics is a cloud-based service and a real-time event analytics and processing engine. It is designed to analyse and process huge amounts of data streamed simultaneously from many sources. It is relatively inexpensive, as the organizations using it are billed for the streaming units they use. Unlike traditional analytics tools that work on data stored on disks or servers, streaming analytics deals with data that is in motion, in real time.
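Processing data in motion can be illustrated with a tumbling-window average, a common streaming operation. This is a simplified single-process sketch in plain Python, not the API of any of the services named in this report. The function name `tumbling_average` and the event format `(timestamp, value)` are assumptions made for the example.

```python
from typing import Iterable, Iterator, Tuple


def tumbling_average(
    events: Iterable[Tuple[float, float]], window: float
) -> Iterator[Tuple[float, float]]:
    """Yield (window_start, mean_value) as soon as each window closes.

    Events are (timestamp, value) pairs and must arrive in time order.
    """
    current_start = None
    total = 0.0
    count = 0
    for timestamp, value in events:
        start = (timestamp // window) * window
        if current_start is not None and start != current_start:
            # A new window has begun, so the previous one can be emitted.
            yield current_start, total / count
            total, count = 0.0, 0
        current_start = start
        total += value
        count += 1
    if count:
        # Emit the last, still-open window when the stream ends.
        yield current_start, total / count
```

Because the function is a generator, each result is produced while the stream is still being consumed, without first storing all events.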

 

Best options available on the market:

  • Azure Stream Analytics (Microsoft)
  • Altair (advanced data mining and predictive analytics)
  • Amazon Web Services
  • Oracle
  • RapidMiner
  • IBM
  • Google Cloud Dataflow

Streaming Analytics in the Automotive Industry: safety is a key aspect of the automotive industry. The volume of processed data and the speed of individual operations must go hand in hand with high precision and reliability.

The automotive industry is dealing with a growing amount of data generated by electric vehicles, traffic services and connected cars linked over 5G networks.

These factors also open up possibilities for innovation in many fields: design, development, analytics of test vehicles, and automation.

Some platforms provide very short response times, which enables users to perform complex calculations and interactive visualizations.


  4. Edge Computing.

 

Edge Computing is a networking philosophy focused on bringing processing as close to the data source as possible. The idea behind this is to save network bandwidth while reducing latency. In practice it means running fewer processes in the cloud and moving them to local devices (edge servers, user computers, IoT devices).

Thanks to this concept, applications run closer to users, either at the network edge or on the device itself.

Edge Computing in the Automotive Industry: take the example of a modern car equipped with many sensors and devices, connected via the cloud to a database.

The large amount of transferred material consumes bandwidth, and the server in the cloud is kept busy as well. Moving the computation to the edge of the network, onto the sensors themselves, would greatly simplify the operation of many devices by shortening the distance over which information has to be transmitted. If each sensor could use its internal computer to run applications and send data to the cloud only when needed, instead of transmitting constantly, this would greatly reduce bandwidth use because the amount of data transferred would be limited.
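The bandwidth saving described above can be sketched as a simple filter running on the sensor itself. This is an illustrative assumption about how such a filter might look, not a description of any real vehicle system. The function name `edge_filter`, the threshold and the sample readings are hypothetical.

```python
def edge_filter(readings, threshold):
    """Return only the readings worth uploading to the cloud.

    A reading is sent when it differs from the last sent reading by at
    least `threshold`; everything else is handled (dropped) locally.
    """
    sent = []
    last_sent = None
    for value in readings:
        if last_sent is None or abs(value - last_sent) >= threshold:
            sent.append(value)
            last_sent = value
    return sent


# Hypothetical temperature readings taken on the device.
readings = [20.0, 20.1, 20.2, 22.5, 22.6, 19.0]
uploaded = edge_filter(readings, threshold=1.0)
```

In this example only three of the six readings leave the device, so half of the transfers are avoided while every significant change still reaches the cloud.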


  5. Artificial Intelligence.

Artificial Intelligence (AI) is intelligence demonstrated by machines, as opposed to the natural intelligence displayed by animals and humans.

"Leading AI textbooks define the field as the study of 'intelligent agents': any system that perceives its environment and takes actions that maximize its chance of achieving its goals."

According to another definition, Artificial Intelligence is the development of computing systems that perform tasks requiring human intelligence, such as speech recognition, visual perception, translation between languages or decision making.

 

 

Figure 1. Types of in-vehicle sensors.

 

 

Artificial Intelligence in the Automotive Industry: the automotive industry has long used the properties and capabilities of artificial intelligence.

Design: AI has long been used in the design process when assessing vehicle risk and damage.

Production: manufacturers use ML algorithms and AI solutions to achieve more efficient production rates.

Supply Chain: the key issue is the supply of appropriate components and the ability to monitor their delivery. The modern supply chain therefore relies on IoT, Blockchain and AI.

Quality Control: AI can analyze vehicle data more efficiently and detect breakdowns and technical problems.

User Experience: AI helps to keep passengers safe and comfortable. Manufacturers introduce various sensors and features, including emotion recognition from facial expressions, sound recognition, and Amazon's AI-driven Alexa.

Driver Assistance Systems: these ensure safety by warning about changes in the weather and in road traffic, and by offering assistance in route selection.


  6. Apache Spark.

Apache Spark is a next-generation, easy-to-use technology for real-time data streaming. It is an open-source distributed data processing system that uses in-memory caching and optimized query execution for fast analytical queries on data of any size. It also provides APIs in Python, Scala, Java and R.

It also provides the following capabilities:

  • interactive queries,
  • real time analysis,
  • graph processing,
  • machine learning,
  • batch processing.

 

Apache Spark in the Automotive Industry: with the development of the automotive industry and the shift to electric cars, engineers face great design challenges. These are accompanied by advanced research on materials and on the performance of individual components. Those deciding on future strategy must have a credible source of information based on transparent data. Apache Spark is an ideal tool for coordinating the activities of many departments of a company.


3. Assignment task

 

The second part of our work is the practical use of the skills acquired during the classes. Of the visual analytics platforms on the market, Tableau was proposed.

Tableau is a relatively easy-to-use data visualization tool that requires no advanced programming skills, as it is based on a simple drag-and-drop interface.

Question 1.

 

Figure 2. Diagram showing the analysis for Question 1.

The first question was to find the manufacturer that offered the smallest number of models. It is clearly Aston Martin, with 8 models released, followed by Subaru (12 models), Mitsubishi (13) and Honda (14).

To complete the analysis, it should be stated that General Motors is the undisputed leader, having brought 127 models to the market. BMW is not far behind with 119 models released.

 

Figure 3. Diagram showing the analysis for Question 1.

To find the correct answer, it was enough to use the count function on the Division category and a filter on Carline.
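The same counting logic can be sketched outside Tableau in plain Python. This is only an approximation of the Tableau steps: it assumes that each row carries a `Division` and a `Carline` field and that every distinct car line counts as one model. The sample rows use placeholder names and are not taken from the assignment dataset.

```python
from collections import defaultdict


def models_per_division(rows):
    """Count distinct car lines per division, smallest count first."""
    carlines = defaultdict(set)
    for row in rows:
        carlines[row["Division"]].add(row["Carline"])
    counts = {division: len(lines) for division, lines in carlines.items()}
    # Sort by count, then by name so that ties are ordered predictably.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


# Placeholder rows; a duplicate car line is counted only once.
sample_rows = [
    {"Division": "Maker A", "Carline": "Model 1"},
    {"Division": "Maker A", "Carline": "Model 1"},
    {"Division": "Maker A", "Carline": "Model 2"},
    {"Division": "Maker B", "Carline": "Model 3"},
]
ranking = models_per_division(sample_rows)
```

The first entry of `ranking` is the division with the fewest models, which is the answer the question asks for.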


Question 2.

 

The second question concerns fuel economy.

Because of its complex nature, it must be broken down into two steps, one for highway and one for city fuel economy.

 

2.1 Highway Fuel Economy

Figure 4. Diagram showing the analysis for Question 2.1.

The data shows that fuel economy was recorded for each model both in urban areas and on the highway (City Fuel Economy and Highway Fuel Economy).

To present the results in a clear graphic form, it is necessary to average the fuel economy for each manufacturer (the Average option applied to the Highway field).
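Averaging per manufacturer can be expressed as a short group-and-average routine. This is a sketch of the calculation rather than of Tableau itself. It assumes the manufacturer is stored in a `Division` field, and the sample values are placeholders, not figures from the dataset.

```python
from collections import defaultdict
from statistics import mean


def average_by(rows, group_field, value_field):
    """Return the mean of value_field for each distinct group_field value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(float(row[value_field]))
    return {key: round(mean(values), 2) for key, values in groups.items()}


# Placeholder rows standing in for the real table.
sample_rows = [
    {"Division": "Maker A", "Highway Fuel Economy": "30"},
    {"Division": "Maker A", "Highway Fuel Economy": "34"},
    {"Division": "Maker B", "Highway Fuel Economy": "25"},
]
averages = average_by(sample_rows, "Division", "Highway Fuel Economy")
best = max(averages, key=averages.get)
```

Passing `"City Fuel Economy"` as the value field gives the corresponding city figures without any other change.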

 

 

 

 

Figure 5. Diagram showing the analysis for Question 2.1.

The chart above shows that Mazda has the best highway fuel economy. Its value of 34.25 is by far the best, more than 1.6 units ahead of Mitsubishi (32.62).

Hyundai is in third place (31.50), slightly ahead of Honda (31.07).

Unfortunately, the units of measurement were not given, which makes our analysis less rigorous from a scientific point of view. Working with numerical values without specifying units distorts the picture and makes the results harder to discuss and less reliable.

For completeness, Aston Martin (19.63), Jaguar (24.00) and Mercedes-Benz (25.32) have the worst fuel economy.


2.2 City Fuel Economy

 

The images below show the most important steps of the visualization and the final result of our work.

Figure 6. Diagram showing the analysis for Question 2.2.

As with highway driving, this part concerns fuel economy, this time in urban areas, so a different field must be used for the visualization. The method remains the same: the fuel economy is averaged for each manufacturer. For this purpose we use the filter and one of the proposed options (Average).

 

 

Figure 7. Diagram showing the analysis for Question 2.2.

The results of the fuel economy analysis in built-up areas are very similar to those for the highway. The ranking of the best producers looks similar.

Mitsubishi (25.46) has the best fuel economy, slightly ahead of Mazda (25.17). These two brands have an advantage of two or more units over the third in the ranking, Hyundai (23.17).

As for the worst results, the last two positions are unchanged compared to highway driving. In built-up areas too, Aston Martin is the least economical producer (13.25).

Jaguar, in the penultimate position, has a slightly better result (16.05).


Question 3.

 

The third question was to check which transmission type shows the highest and the lowest average fuel economy.

Figure 8. Diagram showing the analysis for Question 3.

All the transmission types represented are divided into automatic and manual. Within automatic transmissions, we could distinguish six subcategories.

The question concerns average combined fuel economy, so the appropriate filter had to be used again.

 

 

 

 

 

Figure 9. Diagram showing the analysis for Question 3.

The best average combined fuel economy was shown by the Continuously Variable transmission type (31.64).

This result puts it ahead of the next two types in the Transmission Description subcategories by more than 4 units.

Selectable Continuously Variable (26.95) comes second, slightly ahead of Automated Manual (26.65).

Only the three transmission types mentioned above are better than Manual (24.85); the other automatic transmission types have worse results.

Next come Semi-Automatic (23.37) and Automated Manual Selectable (21.64).

The Automatic transmission type has the worst result (19.95).

The list above shows that the manual gearbox sits in the middle of the ranking, both in terms of its position and in having a value close to the average.
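The claim about the manual gearbox can be checked directly from the seven averages reported above. The sketch below uses only those figures. Note that it takes the unweighted mean of the category averages, which is an assumption about what "the average" refers to and is not necessarily equal to the average over all cars in the dataset.

```python
from statistics import mean

# Average combined fuel economy per transmission type, as reported above.
REPORTED = {
    "Continuously Variable": 31.64,
    "Selectable Continuously Variable": 26.95,
    "Automated Manual": 26.65,
    "Manual": 24.85,
    "Semi-Automatic": 23.37,
    "Automated Manual Selectable": 21.64,
    "Automatic": 19.95,
}


def rank_of(name, values):
    """Return the 1-based position of name when sorted best to worst."""
    ordered = sorted(values, key=values.get, reverse=True)
    return ordered.index(name) + 1


def unweighted_mean(values):
    """Return the mean of the category averages, rounded to two decimals."""
    return round(mean(values.values()), 2)
```

Manual ranks fourth of seven, and the unweighted mean of the seven values is about 25.01, so its 24.85 is indeed close to the average.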


Question 4.

 

The fourth question is complex: to answer it, we must select a producer that meets three conditions.

It must produce both 2-wheel drive and 4-wheel drive cars.

In addition, the cars must have an engine displacement of at least 3.5.

 

 

Figure 10. Diagram showing the analysis for Question 4.

To analyze the Drive Description category properly for this task, it was first necessary to eliminate three of the six types listed, by dragging the field onto the appropriate filter field and ticking the correct boxes.

Figure 11. Diagram showing the analysis for Question 4.

The second stage of the work, meeting the next condition, was to limit the engine size to the value given in the question. Again, it was necessary to use a filter, this time on Engine Displacement, and to set the minimum value.

This eliminates cars with an engine displacement below 3.5.

 

 

 

 

 

 

 

Figure 12. Diagram showing the analysis for Question 4.

In the case of 2-Wheel Drive Front, we have a single value for each manufacturer. The situation gets more complicated for 2-Wheel Drive Rear and 4-Wheel Drive.

Each manufacturer released several models with different engine displacements. General Motors has produced six models, and both Ford and Mercedes-Benz have launched four.

It was no coincidence that this type of chart was chosen to visualize the answer to the question.

It shows the number of models clearly and gives the individual engine displacement value for each model.

 

Figure 13. Diagram showing the analysis for Question 4.

The screenshot above, with red dots (representing Chrysler) and green dots (representing General Motors), visualizes the answer to the question.

Only Chrysler and General Motors are represented in each of the three subcategories. The remaining manufacturers, despite having many models in individual segments, do not meet the conditions set in the task. Both Chrysler and General Motors released cars with a 3.6 engine in the 2-wheel drive front segment.

Chrysler has three representatives in the 2-wheel drive rear segment, the largest being the 6.4 engine of the 8-cylinder Dodge Charger SRT8.

General Motors released as many as six models in this segment with a wide variety of engine sizes, up to a maximum of 7.0 for the 8-cylinder Chevrolet Camaro.

In the 4-wheel drive category, Chrysler has three representatives, of which the Jeep Cherokee SRT8 4×4 has the largest engine, at 6.4.

General Motors has produced two models in the 4-wheel drive category, of which the K1500 YUKON 4WD has the largest engine, at 6.2.
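The filtering logic behind this answer can be summarised as a filter followed by a set comparison. The sketch below is an approximation under stated assumptions: the field names `Division`, `Drive Description` and `Engine Displacement` and the three drive labels are guesses at how the data is laid out, and the sample rows are placeholders rather than real records.

```python
from collections import defaultdict

# Drive types a manufacturer must cover (labels assumed for illustration).
REQUIRED_DRIVES = {"2-Wheel Drive Front", "2-Wheel Drive Rear", "4-Wheel Drive"}


def makers_covering_all_drives(rows, min_displacement=3.5):
    """Return manufacturers with a large-engine model in every required drive type."""
    drives = defaultdict(set)
    for row in rows:
        # Keep only models at or above the displacement threshold.
        if float(row["Engine Displacement"]) >= min_displacement:
            drives[row["Division"]].add(row["Drive Description"])
    return sorted(
        maker for maker, seen in drives.items() if REQUIRED_DRIVES <= seen
    )


# Placeholder rows: Maker B fails because its 4-wheel drive engine is too small.
sample_rows = [
    {"Division": "Maker A", "Drive Description": "2-Wheel Drive Front", "Engine Displacement": "3.6"},
    {"Division": "Maker A", "Drive Description": "2-Wheel Drive Rear", "Engine Displacement": "6.4"},
    {"Division": "Maker A", "Drive Description": "4-Wheel Drive", "Engine Displacement": "6.2"},
    {"Division": "Maker B", "Drive Description": "2-Wheel Drive Front", "Engine Displacement": "3.5"},
    {"Division": "Maker B", "Drive Description": "2-Wheel Drive Rear", "Engine Displacement": "4.0"},
    {"Division": "Maker B", "Drive Description": "4-Wheel Drive", "Engine Displacement": "2.0"},
]
qualifying = makers_covering_all_drives(sample_rows)
```

Applying the displacement filter before collecting the drive types matters: a manufacturer qualifies only if each drive type is covered by a model that already passes the threshold.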


Question 5.

 

 

Data analytics is becoming an indispensable branch of business, with a growing presence on the market. Companies and organizations that want to develop and evolve must respond to changing trends, and to stay ahead of the competition they should implement predictive analytics mechanisms that anticipate the market.

At a time when the amount of data is growing exponentially, traditional tools and data processing platforms are losing their relevance. They are being replaced by technologies that allow the storage and analysis of huge amounts of data from various sources.

Skillful market analysis based on valuable data is the key to understanding the regularities that govern the business world, as well as engineering and many other areas of life. It helps to identify neglected areas and contributes to their improvement. Development strategies are built on properly prepared analysis, thanks to predictive analytics.

Platforms for data analysis and visualization are becoming more accessible and easier to use, which affects company productivity and finances.

Blockchain technology creates a range of safe and accessible media for financial transactions.

However, data analysis requires an enormous amount of work to acquire and match valuable data from many sources. Data from the wrong sources can lead to erroneous inferences.

There is also a moral aspect: some ways of obtaining data are unethical and may violate people's privacy or criminal law.


4. References and Bibliography

 

 

  1. Techopedia. Big Data Analytics. www.techopedia.com/definition/28659/big-data-analytics
  2. swisscognitive.ch/wp-content/uploads/2021/04/reflection-current-status-industry-40-with-ai-and-smart-manufacturing.pdf
  3. What is Apache Spark? | Introduction to Apache Spark and Analytics | AWS (amazon.com)
  4. Milligan, J., 2020. Learning Tableau 2020 – Fourth Edition. [S.l.]: Packt Publishing. 
  5. MongoDB. 2021. NoSQL vs SQL Databases. [online] Available at: [Accessed 23 March 2022].
  6. apriorit.com/dev-blog/728-ai-applications-automotive-industry 
  7. .cloudflare.com/en-gb/learning/serverless/glossary/what-is-edge-computing/

 

 

 
