This course is ideal for professionals in a wide variety of roles across a broad range of industries, given the rising popularity and accessibility of data science. You'll need some prior experience with Python; any prior work with libraries such as Pandas and Matplotlib will give you a useful head start.
- Identify potential areas of investigation and perform exploratory data analysis
- Plan a machine learning classification strategy and train classification models
- Use validation curves and dimensionality reduction to tune and enhance your models
- Scrape tabular data from web pages and transform it into Pandas DataFrames
- Create interactive, web-friendly visualizations to clearly communicate your findings
This course will require a computer system for the instructor and one for each student. The minimum hardware requirements are as follows:
- Processor: Intel i5 (or equivalent)
- Memory: 8 GB RAM
- Hard disk: 10 GB
- An internet connection
- Python 3.5+
- Anaconda 4.3+
Python libraries included with the Anaconda installation:
- matplotlib 2.1.0+
- ipython 6.1.0+
- requests 2.18.4+
- beautifulsoup4 4.6.0+
- numpy 1.13.1+
- pandas 0.20.3+
- scikit-learn 0.19.0+
- seaborn 0.8.0+
- bokeh 0.12.10+
Python libraries that require manual installation:
- mlxtend
- version_information
- ipython-sql
- pdir2
- graphviz
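The version requirements above can be checked from Python before the course starts. The following is a minimal sketch, not an official setup script: the helper names (`version_tuple`, `meets_minimum`, `check_environment`) are hypothetical, and it assumes Python 3.8 or later, because it relies on the standard-library `importlib.metadata` module. It also assumes that each name in the lists matches the installed distribution name. Versions are compared numerically rather than as strings, so that `0.12.10` is correctly treated as newer than `0.12.9`.

```python
from importlib import metadata  # standard library from Python 3.8 onward

# Minimum versions copied from the list of libraries bundled with Anaconda.
MINIMUM_VERSIONS = {
    "matplotlib": "2.1.0",
    "ipython": "6.1.0",
    "requests": "2.18.4",
    "beautifulsoup4": "4.6.0",
    "numpy": "1.13.1",
    "pandas": "0.20.3",
    "scikit-learn": "0.19.0",
    "seaborn": "0.8.0",
    "bokeh": "0.12.10",
}

# Manually installed libraries; the list gives no minimum versions for these.
MANUAL_PACKAGES = ["mlxtend", "version_information", "ipython-sql", "pdir2", "graphviz"]


def version_tuple(version):
    """Turn '0.12.10' into (0, 12, 10), stopping at any non-numeric suffix."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.1' and '2.1.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, required in MINIMUM_VERSIONS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, required):
            report[name] = "ok"
        else:
            report[name] = "too old ({})".format(installed)
    for name in MANUAL_PACKAGES:
        try:
            metadata.version(name)
            report[name] = "ok"
        except metadata.PackageNotFoundError:
            report[name] = "missing"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print("{:20s} {}".format(package, status))
```

Any package reported as `missing` or `too old` would need to be installed or upgraded with the package manager you normally use before working through the material.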