Sunday, September 22, 2019

Homepage: Python Study Notes

Study notes 1: Python general, dataframe, SQL, Plot

How do we install packages in Python?
How do we send emails in Python?
How do we load/output a CSV file in/to Python?
How do we read/load R datasets in/to Python?
How do we subset a dataframe in Python?
How do we use simple SQL on a Python dataframe?
How do we download data from Netezza?
How do we download data from Amazon Redshift?
How do we upload data from Python onto Amazon Redshift?
How do we include a separate piece of Python code to run?
How do we convert a list into a dataframe?
Local vs. Global Variables
How do we get outliers from python?
Internal studentized residual VS external studentized residual
How do we scatter plot in python?
How do we drop big dataframe and release memory in python?

  Study notes 2: F-score, Random Forest, L1, L2, Gradient boosting

What's F score? What's precision/recall?
Why is it called "Random" "Forest"?
How do we scatter plot in python?
Adaptive boosting VS Gradient boosting?
Gradient boosting VS Random Forest?
Adaboost VS Gradient boosting VS XGBoost
What is python generator/yield?
L1-norm vs L2-norm in machine learning
Gradient Descent VS Stochastic Gradient Descent
What's a deep neural network? Why is it called deep?
How do we overcome the local minima issue?
What's softmax? Where does the "soft" come from?
What's KNN (k-nearest neighbors) vs. K-means clustering?

  Study notes 3: Most useful package/library, dummy coding, sparse matrix, generator

Common useful packages/libraries/modules for machine learning
Using dummy coding for categorical variables in a regression example
How do we use a sparse matrix to save memory for a large design matrix?
How do we use a generator to load a huge data/design matrix gradually?

  Study notes 4: Issues/Errors/Warnings Collection  

How do we get rid of NaN and inf values/rows?
Spyder is already running, restart kernel failed
Spyder IDE not opening because of problems with spyder.lock
Github error: src refspec remotes/origin/ matches more than one
Github error: Ambiguous object name: 'remotes/origin/'

  Study notes 5: Layman's steps to start git work

What's an SSH key? How do we generate an SSH key for GitHub?
What's the preparation work before git?
What are the layman's steps to start git work?

  Study notes 6: Spark SQL session example

PySpark Dataframe tutorial example

  Study notes 7: Python programming -7: TensorFlow examples

What's TensorFlow in layman's terms?
What's NIST/MNIST/Fashion MNIST database?

  Study notes 8: Python programming -8: AWS SageMaker/S3 Study Notes

What's Amazon SageMaker? How do we run a training job inside it?

  Study notes 9: Python/SQL Tips/Tricks

What's the difference between UNION and UNION ALL?
What are mutable and immutable objects in Python?

  Study notes 10: Plot all kinds of graphs in Python

How do we plot a simple histogram?
How do we plot with a second y-axis?
How do we scatter plot by group in Python?

  Study notes 11: Google Cloud Platform (GCP)

What's the product matching between AWS and GCP?

  Study notes 12: Tutorial on AWS Lambda functions

AWS Lambda function to schedule a batch transform job

  Study notes 13: NLP study notes and tutorial

  Study notes 14: Hive HUE error collection

  Study notes 15: Darknet/Tensorflow (Darkflow)/YOLO tutorial

