The era of Big Data: talking with Joan Navarro
Big Data has become a hot topic, and its impact on industry and society is already undeniable. The amount of available data grows every day, and that data translates into valuable information that can improve almost any decision.
Dr. Joan Navarro, an expert in the analysis and processing of massive data and coordinator of the Master's Degree in Big Data at La Salle-URL, gives us a preview of the trends and topics most in demand in the sector.
Don't miss it!
Big Data in the business world
Our devices, our digital footprint and the traces we leave when browsing the Internet generate data that, properly stored, processed and analyzed, helps improve business decision making.
Being able to extract relevant business information from this data is probably one of the greatest technological advances to date.
"This large volume of data cannot be ignored; it has to be exploited."
- Joan Navarro, coordinator of the Master's Degree in Big Data Engineering
Working with large volumes of data (Big Data) requires specific infrastructures and technologies, whether the goal is advanced, individualized knowledge of our customers, improved operational efficiency, or new opportunities based on the systematic analysis of data. The latest labor-market studies indicate a growing need for profiles trained in the design and operation of highly scalable systems that enable the analysis and extraction of value from large volumes of data.
But how do we process this large amount of data in order to achieve this?
The data lifecycle - a continuous process of cleansing and analysis
The data lifecycle is the set of stages that the data engineer goes through from obtaining the information until it is processed and exploited.
"By storing, processing, analyzing and treating data properly, we can improve business decision making."
- Joan Navarro, coordinator of the Master's Degree in Big Data Engineering
In this process, each stage is assessed in terms of quality, security, privacy and access, among other criteria, to guarantee the reliability of the information and improve business decision making.
Data acquisition: Data can be generated by any digital device: smartphones, social networks, commercial transactions, surveys, and so on. It can also be acquired from external sources, such as public or private databases, data providers or third-party services. This phase requires technologies that link the data-generation sources with the storage and processing infrastructure.
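As a minimal sketch of this phase, the snippet below ingests a stream of raw JSON events (the records and field names are hypothetical) and discards anything that fails validation before it reaches storage:

```python
import json

# Hypothetical raw events, as they might arrive from a device or web tracker
raw_events = [
    '{"user": "u1", "action": "click", "ts": 1700000000}',
    '{"user": "u2", "action": "view",  "ts": 1700000005}',
    'not valid json',                      # malformed input is common at ingestion
    '{"user": "u1", "action": "buy",   "ts": 1700000042}',
]

def ingest(lines):
    """Parse each incoming line, discarding records that fail validation."""
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue                       # in production: route to a dead-letter queue
        if {"user", "action", "ts"} <= event.keys():
            yield event

events = list(ingest(raw_events))
print(len(events))  # 3 valid events survive ingestion
```

In a real deployment this role is played by ingestion technologies (message queues, connectors) rather than a Python generator, but the validate-then-forward pattern is the same.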
Data storage: Once captured, data is stored in different types of infrastructure (e.g., NoSQL databases, data lakes hosted in public and/or private clouds) to facilitate and optimize its further processing. This storage must be secure, scalable and accessible to the users who process it.
Data processing: In this phase, data is converted into useful information using highly scalable techniques for parallel and distributed processing. Processing can happen in real time, in batches, or as a continuous stream, and requires a technological ecosystem capable of handling large volumes of data. At the end of this phase, the data, now information, is ready to be displayed in an intelligible form through dashboards or high-level reports.
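The parallel-and-distributed idea can be illustrated with a toy map-reduce in plain Python (the records are invented): each partition is aggregated independently, as a distributed engine would do across workers, and the partial results are then merged.

```python
from collections import Counter
from functools import reduce

# Toy batch of page-view records, split into partitions the way a
# distributed engine would shard them across workers
partitions = [
    [{"page": "/home"}, {"page": "/shop"}, {"page": "/home"}],
    [{"page": "/shop"}, {"page": "/home"}],
]

# Map step: each partition is counted independently (parallelizable)
partial_counts = [Counter(r["page"] for r in part) for part in partitions]

# Reduce step: partial results are merged into the final aggregate
totals = reduce(lambda a, b: a + b, partial_counts)
print(totals["/home"])  # 3
```

Frameworks such as Hadoop or Apache Spark apply this same map-then-reduce structure, but across machines and at a vastly larger scale.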
Data analysis: In this phase, valuable information is systematically extracted from the data using statistical techniques, data mining, machine learning and predictive analytics. The results of this analysis enable fact-based decisions, identifying opportunities or solving problems.
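A small predictive-analytics sketch, with invented monthly sales figures: a least-squares trend line is fitted by hand and used to forecast the next month.

```python
from statistics import mean

# Hypothetical monthly sales figures; fit a least-squares trend line
months = [1, 2, 3, 4, 5, 6]
sales  = [100, 108, 119, 127, 141, 150]

mx, my = mean(months), mean(sales)
slope = sum((x - mx) * (y - my) for x, y in zip(months, sales)) / \
        sum((x - mx) ** 2 for x in months)
intercept = my - slope * mx

forecast = slope * 7 + intercept   # predict month 7
print(round(slope, 1), round(forecast))  # 10.2 160
```

Real analyses rely on libraries (R, scikit-learn, TensorFlow, PyTorch) rather than hand-rolled formulas, but the principle of fitting a model to historical data and extrapolating is the same.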
Data distribution: Finally, once the analytics are available, they can be used for business analysis and leveraged in decision making. Managing access permissions and data privacy is critical at this stage.
The data lifecycle is a continuous process that must be carefully managed at each stage in order to get the most out of the data available.
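The lifecycle described above can be chained end to end. The sketch below, with invented product records, strings the stages together as plain functions; each stand-in (an in-memory list for the data lake, a dict for the dashboard) marks where real infrastructure would sit.

```python
import json
from collections import Counter

def acquire():
    """Acquisition: raw JSON lines from a simulated source (hypothetical data)."""
    yield from ['{"product": "A"}', '{"product": "B"}', 'bad line', '{"product": "A"}']

def store(raw):
    """Storage: parse and keep valid records (an in-memory stand-in for a data lake)."""
    records = []
    for line in raw:
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            pass  # discard malformed input
    return records

def process(records):
    """Processing: aggregate raw records into information ready for analysis."""
    return Counter(r["product"] for r in records)

def distribute(info):
    """Distribution: expose results, e.g. to a dashboard; here, a sorted summary."""
    return {k: v for k, v in info.most_common()}

report = distribute(process(store(acquire())))
print(report)  # {'A': 2, 'B': 1}
```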
Perspectives in the era of Big Data - Joan Navarro
Knowing how to analyze the data generated by the digital footprint we leave as individuals in society is increasingly important. Companies compete to obtain this information and adapt to new market needs.
In this post we talk to Joan Navarro, Big Data expert and coordinator of the Master's Degree in Big Data Engineering at La Salle-URL, who gives us a preview of the challenges and tools that facilitate data analysis.
What are the biggest challenges in handling and managing large data sets, and how are they addressed?
The first challenge for a company is to identify the tools it really needs to solve its problems. We must know what technologies are available on the market and what their strengths and weaknesses are. Companies often face an overwhelming number of alternatives, so organizations must evaluate their needs carefully in order to invest in the tools that best fit them.
We also need to be aware of the amount of data we can get, so we can design the best strategies to ensure its security and privacy, as well as its storage and processing.
Designing the right infrastructures to work with Big Data, and maintaining them over time with constant updates as technology advances, is also one of the great challenges today. Many of the technological tools available are not yet mature, and incompatibilities often arise when integrating them with other systems. It is therefore essential to have specialized and continuously trained personnel.
How are Big Data tools being used to analyze and predict trends in different industries?
The technologies and tools we use in each of the phases of the data cycle will be different, depending on the objective and functionality. For example:
- Infrastructure - There are currently many cloud computing providers (Google Cloud, Microsoft Azure, Amazon Web Services...) that offer a wide range of services to store and process large volumes of data.
- Data processing - To process data while making the most of the infrastructure that stores it, there are several alternatives such as Hadoop, Apache Spark, Disco, Hydra...
- Business Intelligence - Again, we must choose the tool that best fits the business needs, and in this case, there is a strong dependence on the technologies already deployed in the company. There are different tools such as Power BI, Oracle Business Intelligence, Tableau, Qlik...
- Analytics - At this stage there are well-established tools such as R, SAS or Matlab. For the analysis of large volumes of data there are specific libraries (e.g., TensorFlow, PyTorch) that exploit the computational performance of the hardware (GPUs).
What are the emerging trends in Big Data analytics, and what profiles are companies asking for?
The main emerging trend is to capture as much data as possible. Even if the data cannot be exploited right away, the priority is to have the technology and infrastructure to store it.
Companies need suitable profiles that can respond to this need. Technological progress is indisputable, and we must be prepared. There are currently two main types of profiles in this field:
Vertical profiles - specialists focused on one of the four phases of the data lifecycle:
Data architect profile: specialist in the design and management of data infrastructure, with the objective of ensuring that data is available, secure, well organized, easy to access and use, and integrated effectively with other business applications and systems.
Data analyst profile: specialist in efficient and scalable data processing for data organization, cleansing and analysis.
Business intelligence consultant profile: specialist in the design, implementation and management of technological solutions that present business information for strategic decision making.
Data scientist profile: specialist in the modeling of large complex data sets using advanced analytics and machine learning techniques.
Horizontal profile - a specialist who spans the entire data lifecycle:
Data engineer profile: specialist in the design, construction, maintenance and optimization of data management and processing systems, with the knowledge needed to solve problems and add value in all four phases of the data lifecycle.
Continuing education in Big Data and data analysis - La Salle-URL
The data technology sector is booming. The demand for qualified profiles continues to grow, and companies require experts capable of analyzing and drawing conclusions from data. According to a study published by McKinsey, the Big Data field is expected to generate more than 2 million jobs in the US in the coming years.
To work in the Big Data sector, specific training is required in areas such as statistics, programming, computer engineering, artificial intelligence, among others, and employment opportunities are found in different areas such as banking, marketing, health...
If you are thinking of boosting your professional career in a fast-growing sector, La Salle-URL has the Master's Degree in Big Data Engineering, designed for students looking to start their career in the field as well as for professionals who want to update their technological skills.
The master's degree covers the technologies used throughout the data lifecycle (infrastructure, software, data extraction and exploitation, mining...), opening up a new range of job opportunities in the sector.
Discover La Salle-URL!
MASTER OF SCIENCE IN BIG DATA ENGINEERING