23 March 2023 | Posted by angela.tuduri
The era of Big Data: talking with Joan Navarro
39% of companies suffer from data analytics skills shortages - where do we go from here?

Big Data has become a hot topic, and its impact on industry and society is already undeniable. Every day the amount of available data grows: data that translates into valuable information with which we can improve almost any decision.
Dr. Joan Navarro, expert in the analysis and processing of massive data and coordinator of the Master's Degree in Big Data at La Salle-URL, gives us a preview of the trends and skills most in demand in the sector.
Don't miss it!
Big Data in the business world
Our devices, our digital footprint and the traces we leave when browsing the Internet provide information that, properly stored, analyzed and processed, helps us improve business decision making.
Being able to study this data and extract relevant information about the business is probably one of the most valuable technological advances to date.
“This large volume of data cannot be ignored; it has to be exploited.” - Joan Navarro, coordinator of the Master's Degree in Big Data Engineering
In order to gain advanced, individualized knowledge of our customers, improve the operational efficiency of our company and/or uncover new opportunities based on the systematic analysis of data, specific infrastructures and technologies are required to work with large volumes of data (Big Data). The latest labor market studies indicate a growing need for profiles trained in the design and operation of highly scalable systems that allow the analysis and extraction of value from large volumes of data.
But how do we process such a large amount of data?
The data lifecycle - a continuous process of cleansing and analysis
The data lifecycle is the set of stages the data engineer goes through, from obtaining the information until it is processed and exploited.
“By storing, processing, analyzing and treating data properly, we can have an impact on improving business decision making.” - Joan Navarro, coordinator of the Master's Degree in Big Data Engineering
At each stage of the process, the data is assessed in terms of quality, security, privacy and access, among other criteria, to guarantee the nature and reliability of the information and improve business decision making. The main stages are described below (a minimal end-to-end sketch in Python follows the list).
- Data acquisition: Data can be generated by any digital device (smartphones, social networks, commercial transactions, surveys...) or acquired from external sources such as public or private databases, data providers or third-party services. This phase requires technologies that link the data-generating sources with the storage and processing infrastructure.
- Data storage: Once captured, data is stored in different types of infrastructure (e.g., NoSQL databases, data lakes hosted in public and/or private clouds) to facilitate and optimize its further processing. This storage must be secure, scalable and accessible to the users who process it.
- Data processing: In this phase, data is converted into useful information by means of highly scalable techniques for parallel and distributed processing. As a result, the data (now information) is ready to be presented in an intelligible form through dashboards or high-level reports. This processing can be done in real time, in batches, or as a continuous stream, and it requires a technological ecosystem capable of handling large volumes of data.
- Data analysis: In this phase, the valuable information in the data is systematically analyzed using statistical techniques, data mining, machine learning and predictive analytics. The results of this analysis enable fact-based decisions, helping to identify opportunities and solve problems.
- Data distribution: Finally, once the analytical results are available, they can be shared for business analysis and leveraged in decision making. Managing access permissions and data privacy is critical at this stage.
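To make these stages concrete, below is a minimal end-to-end sketch in Python using pandas. It is an illustration only: the file name and column names (transactions.csv, customer_id, amount) are hypothetical, and a real Big Data pipeline would replace each step with the distributed technologies discussed below.

```python
# A minimal sketch of the five lifecycle stages with pandas.
# File and column names are hypothetical, for illustration only.
import pandas as pd

# 1. Acquisition: load raw data from an external source (here, a CSV export).
raw = pd.read_csv("transactions.csv")

# 2. Storage: persist a raw copy in an analysis-friendly format.
raw.to_parquet("transactions_raw.parquet")

# 3. Processing: clean the data so it is ready for analysis.
clean = raw.dropna(subset=["customer_id", "amount"])
clean = clean[clean["amount"] > 0]

# 4. Analysis: derive a simple business metric per customer.
spend = clean.groupby("customer_id")["amount"].agg(["sum", "mean"])

# 5. Distribution: export the result for a dashboard or report.
spend.to_csv("customer_spend_report.csv")
```

At real-world scale, each of these stages is supported by specialized tools and technologies: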
- Infrastructure - There are currently many cloud computing providers (Google Cloud, Microsoft Azure, Amazon Web Services...) that offer a wide range of services to store and process large volumes of data (see the storage sketch after this list).
- Data processing - To process data while making the most of the infrastructure that stores it, there are several alternatives such as Hadoop, Apache Spark, Disco, Hydra... (a minimal Spark example follows this list).
- Business Intelligence - Again, we must choose the tool that best fits the business needs, and in this case, there is a strong dependence on the technologies already deployed in the company. There are different tools such as Power BI, Oracle Business Intelligence, Tableau, Qlik...
- Analytics - At this stage there are well-established tools such as R, SAS, or Matlab. For the analysis of large volumes of data there are specific libraries (e.g., TensorFlow, PyTorch) that make it possible to exploit the computational performance of the system (GPUs); a minimal GPU sketch follows this list.
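As an illustration of the infrastructure point, this is a minimal sketch of uploading a dataset to a cloud object store with the Google Cloud Storage Python client (Azure and AWS offer equivalent SDKs). The bucket and file names are hypothetical, and valid credentials are assumed to be configured in the environment.

```python
# Minimal sketch: pushing a local dataset into a cloud data lake.
# Bucket and object names are hypothetical.
from google.cloud import storage

client = storage.Client()  # picks up default credentials from the environment
bucket = client.bucket("my-data-lake")  # hypothetical bucket
blob = bucket.blob("raw/transactions_raw.parquet")
blob.upload_from_filename("transactions_raw.parquet")
print(f"Uploaded to gs://{bucket.name}/{blob.name}")
```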
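For the processing layer, a minimal Apache Spark (PySpark) sketch of the same per-customer aggregation as the pandas example might look like this; the paths are hypothetical, and the cluster is assumed to have the appropriate storage connector configured.

```python
# Minimal PySpark job: the same aggregation as the pandas sketch,
# but executed in parallel across a cluster. Paths are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("customer-spend").getOrCreate()

df = spark.read.parquet("gs://my-data-lake/raw/transactions_raw.parquet")
spend = (
    df.filter(F.col("amount") > 0)
      .groupBy("customer_id")
      .agg(F.sum("amount").alias("total"), F.avg("amount").alias("mean"))
)
spend.write.mode("overwrite").parquet("gs://my-data-lake/reports/customer_spend")
```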
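On the analytics side, libraries such as PyTorch exploit GPUs transparently. A minimal sketch, using synthetic data and falling back to the CPU when no GPU is available:

```python
# Minimal PyTorch sketch: fitting a linear model on synthetic data,
# running on the GPU when one is available.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

X = torch.randn(10_000, 20, device=device)   # synthetic features
y = X @ torch.randn(20, 1, device=device)    # synthetic targets

model = torch.nn.Linear(20, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()   # gradients are computed on the selected device
    optimizer.step()

print(f"final loss on {device}: {loss.item():.4f}")
```

Working with this stack calls for a range of specialized profiles that the market increasingly demands: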
- Data architect profile, specialist in the design and management of data infrastructure, with the objective of ensuring that data is available, secure, well organized, easy to access and use, and integrates effectively with other business applications and systems.
- Data analyst profile, specialist in efficient and scalable data processing for data organization, cleansing and analysis.
- Business intelligence consultant profile, specialized in the design, implementation and management of technological solutions that allow the presentation of business information for strategic decision making.
- Data scientist profile: specialist in the modeling of large complex data sets using advanced analytics and machine learning techniques.
- Data engineer: this type of profile is a specialist in the design, construction, maintenance and optimization of data management and processing systems, and therefore has the knowledge needed to solve problems and add value in every phase of the data life cycle.