Notes in Machine Learning with Python(2)

Posted on 2016-11-30 | Updated on 2017-05-05 | Category: Machine Learning

Environment Problem

numpy and scipy
the usual ones: pip install pandas quandl scikit-learn numpy matplotlib
pythonprogramming.net
github

Regression

numpy array : xs*ys multiplies the elements of xs and ys pairwise, matched by index (plain Python lists do not support this and raise TypeError)
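
The element-wise behaviour of xs*ys applies to NumPy arrays, not to built-in lists. A minimal sketch with made-up values (the names `products` and `products_from_lists` are only for illustration):

```python
import numpy as np

xs = np.array([1, 2, 3], dtype=np.float64)
ys = np.array([4, 5, 6], dtype=np.float64)

# Element-wise product: pairs the values that share an index.
products = xs * ys  # 1*4, 2*5, 3*6

# Built-in lists do not support list * list; it raises TypeError.
try:
    [1, 2, 3] * [4, 5, 6]
    list_multiply_works = True
except TypeError:
    list_multiply_works = False

# Pure-Python equivalent for lists, using zip to pair elements by index.
products_from_lists = [x * y for x, y in zip([1, 2, 3], [4, 5, 6])]
```

Converting lists with np.array() first is usually the simplest route when the data is going into a regression formula anyway.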

Matplotlib

plt.scatter() -> scatter plot

Classification

K Nearest Neighbors Application

dataset

numpy.reshape :

example_measures


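## a: the array, or array-like object, to be reshaped

## newshape: an int or a tuple of ints. The new shape must be compatible with the original shape. If it is an int, the result is a 1-D array of that length;

## if it is a tuple, one element may be -1, meaning that dimension is left unspecified and is inferred from the array's length and the remaining dimensions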
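
A small sketch of how the newshape argument behaves, including the -1 placeholder. The values in `example_measures` are made up for illustration:

```python
import numpy as np

# Hypothetical feature values for a single sample (9 numbers).
example_measures = np.array([4, 2, 1, 1, 1, 2, 3, 2, 1])

# (1, -1): one row; the column count is inferred from the 9 elements.
one_row = example_measures.reshape(1, -1)

# (-1, 3): three columns; the row count is inferred.
three_cols = example_measures.reshape(-1, 3)

# An integer newshape gives a 1-D array of that length.
flat = three_cols.reshape(9)
```

The (1, -1) form is handy when a classifier expects a 2-D array of samples and only one sample is available.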

lib - warnings :
is set to a value less than total voting groups!')
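
The warning line above is truncated, so the following is only a sketch of how warnings.warn could guard the choice of k. The helper `check_k`, its condition, and its message text are assumptions, not the original code:

```python
import warnings

def check_k(data, k):
    """Hypothetical helper: warn when k is not larger than the number of groups."""
    if len(data) >= k:
        # warnings.warn reports the problem but does not stop execution.
        warnings.warn('k is not larger than the number of voting groups')
        return False
    return True

dataset = {'k': [[1, 2]], 'r': [[6, 5]]}

# catch_warnings(record=True) collects warnings in a list instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    ok_small = check_k(dataset, k=1)   # 2 groups >= 1, so a warning is issued
    ok_large = check_k(dataset, k=3)   # 2 groups < 3, no warning
```

Unlike raising an exception, a warning lets the classifier keep running while still flagging that the vote may be unreliable.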

numpy.linalg.norm : np.linalg.norm(np.array(features) - np.array(predict))
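
With its default arguments, np.linalg.norm of a difference vector is the Euclidean distance between the two points. A quick check with made-up points, compared against the formula written out by hand:

```python
import numpy as np

features = [1, 2]
predict = [4, 6]

# Difference vector between the two points.
diff = np.array(features) - np.array(predict)

# Default norm of a 1-D array is the Euclidean (L2) norm.
distance = np.linalg.norm(diff)

# The same value computed directly: square root of the sum of squares.
manual = np.sqrt(np.sum(diff ** 2))
```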

python dictionary:

import numpy as np

dataset = {'k':[[1,2],[2,3],[3,1]], 'r':[[6,5],[7,7],[8,6]]}
new_features = [5,7]
distances = []
for group in dataset:  # iterating over a dict yields its keys (the class labels)
    for features in dataset[group]:
        euclidean_distance = np.linalg.norm(np.array(features) - np.array(new_features))
        distances.append([euclidean_distance, group])

Lib - Counter : from collections import Counter
- vote_result = Counter(votes).most_common(1)[0][0]
- most_common(1) returns a list of tuples; the argument 1 is how many of the most common elements to return
- each tuple is (element, number of occurrences), so [0][0] picks out the most common element itself
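
A sketch of the voting step, assuming a list of [distance, group] pairs like the one built in the loop above. The values, `k`, and the `confidence` ratio are made up for illustration:

```python
from collections import Counter

# Hypothetical [distance, group] pairs.
distances = [[5.0, 'k'], [1.0, 'r'], [2.2, 'r'], [4.1, 'k'], [2.0, 'r']]
k = 3

# Sort by distance and keep the labels of the k closest points.
votes = [group for _, group in sorted(distances)[:k]]

# most_common(1) returns a one-element list holding an (element, count) tuple.
most_common = Counter(votes).most_common(1)
vote_result = most_common[0][0]      # the winning label
confidence = most_common[0][1] / k   # share of the k votes it received
```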

use negative indices and slices ([-num]) on lists flexibly

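test_size = 0.2
train_set = {2:[], 4:[]}
test_set = {2:[], 4:[]}
# full_data: list of rows whose last element is the class label (2 or 4)
train_data = full_data[:-int(test_size*len(full_data))]  # everything except the last 20%
test_data = full_data[-int(test_size*len(full_data)):]   # only the last 20%

for i in train_data:
    train_set[i[-1]].append(i[:-1])  # key by the label, store the remaining features

for i in test_data:
    test_set[i[-1]].append(i[:-1])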
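
The negative-slice idiom can be tried on a small stand-in for `full_data`. The rows below are invented (two features followed by a label of 2 or 4), and the fixed seed is only there to make the shuffle repeatable:

```python
import random

# Hypothetical rows: features followed by the class label in the last column.
full_data = [[1, 1, 2], [2, 1, 2], [8, 9, 4], [9, 8, 4], [1, 2, 2],
             [7, 9, 4], [2, 2, 2], [9, 9, 4], [1, 3, 2], [8, 8, 4]]
random.seed(0)
random.shuffle(full_data)  # shuffle in place before splitting

test_size = 0.2
n_test = int(test_size * len(full_data))  # number of rows held out for testing

train_data = full_data[:-n_test]  # everything except the last n_test rows
test_data = full_data[-n_test:]   # only the last n_test rows

# The same idea on a single row: last element is the label, the rest are features.
row = full_data[0]
label, features = row[-1], row[:-1]
```

One caveat: if n_test works out to 0, full_data[:-0] is an empty list, so this idiom needs at least one test row.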