The purpose of the thesis was to study data analysis and the steps and software involved in it. The practical part describes the development of a simple data analysis pipeline. The idea was to form a clear picture of what kind of process takes place and how much it might vary between projects. Another aim of the thesis was to find out how much experience with the software involved, or with coding, is required to get started. The thesis was not commissioned by an organization and was done out of personal interest in the subject. The knowledge base of the thesis consisted of courses taken during the degree program. The theoretical part makes up most of the thesis. Its first part covers the steps involved in data analysis and the methods used in them. Its second part goes through some of the software used in data analysis. Research was mainly conducted by reading several articles on the subject, first to form a basic understanding of what is involved. After this, further research was conducted into the individual steps and processes. The development work also provided insight into what was initially overlooked during the research step. The results of the thesis provided a clear picture of how a simple data analysis pipeline could function. They also showed how complex the process is and how hard it would be for an outsider to get started. It should be taken into account, however, that the scale and complexity of the development work were on the smaller side. It would be beneficial to expand upon the work to learn about more methods and how they function in bigger projects.
Data analysis, data mining, data processing, software development
Häme University of Applied Sciences
Mikael Virtanen