Py) ML - k-Nearest Neighbors (kNN)

Let's look at how to perform a k-nearest neighbors (kNN) analysis in Python.

Category

Machine learning -> Supervised learning

Concept

Classification model

Regression model

Related formulas

Example code
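
A minimal sketch of the formulas usually associated with kNN, assuming the Euclidean distance (scikit-learn's default metric, Minkowski with p = 2) and uniform neighbor weights. The symbols x, x_i, y_i, and N_k(x) are notation chosen here for illustration.

```latex
% Euclidean distance between a query point x and a training point x_i (p features)
d(x, x_i) = \sqrt{\sum_{j=1}^{p} (x_j - x_{ij})^2}

% Classification: majority vote among the k nearest neighbors N_k(x)
\hat{y}(x) = \arg\max_{c} \sum_{i \in N_k(x)} \mathbf{1}(y_i = c)

% Regression: average of the neighbors' target values
\hat{y}(x) = \frac{1}{k} \sum_{i \in N_k(x)} y_i
```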

import pandas as pd
df = pd.read_csv("iris.csv")
df.head(2)
   SepalLength  SepalWidth  PetalLength  Petal.Width Species
0          5.1         3.5          1.4          0.2  setosa
1          4.9         3.0          1.4          0.2  setosa

Classification model

from sklearn.neighbors import KNeighborsClassifier

# Fit a 3-nearest-neighbor classifier on the four measurement columns
model_clf = KNeighborsClassifier(n_neighbors = 3)
model_clf.fit(X = df.drop("Species", axis = 1), y = df["Species"])
## KNeighborsClassifier(n_neighbors=3)

# Predict on the same rows used for fitting (in-sample predictions)
pred_clf = model_clf.predict(df.drop("Species", axis = 1))
pred_clf[:4]
## array(['setosa', 'setosa', 'setosa', 'setosa'], dtype=object)

# Confusion table: actual species (rows) vs. predicted species (columns)
pd.crosstab(df["Species"], pred_clf)
col_0       setosa  versicolor  virginica
Species
setosa          50           0          0
versicolor       0          47          3
virginica        0           3         47
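
The confusion table above can be summarized as a single accuracy figure. Because that table comes from predicting on the same rows the model was fitted on, a held-out split gives a more cautious estimate. The following is a minimal sketch, not a definitive evaluation: it assumes scikit-learn's bundled `load_iris()` as a stand-in for `iris.csv` (the two may differ slightly), and the 30% test size and `random_state=0` are arbitrary choices made here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Confusion table copied from the output above (rows: actual, columns: predicted)
cm = np.array([[50, 0, 0],
               [0, 47, 3],
               [0, 3, 47]])

# Accuracy = correctly classified rows (diagonal) / all rows
acc_in_sample = np.trace(cm) / cm.sum()
print(acc_in_sample)  # 144 / 150 = 0.96

# Assumption: load_iris() stands in for iris.csv, which may differ slightly
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows, stratified by species, so the score is out-of-sample
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
acc_holdout = model.score(X_test, y_test)  # mean accuracy on unseen rows
print(acc_holdout)
```

The in-sample figure tends to be optimistic for kNN, since every row counts itself among its own nearest neighbors when the training data is reused for prediction.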

Regression model

from sklearn.neighbors import KNeighborsRegressor

# Predict the fourth column (petal width) from the first three measurement columns
model_reg = KNeighborsRegressor(n_neighbors = 3)
model_reg.fit(X = df.iloc[:, :3], y = df.iloc[:, 3])
## KNeighborsRegressor(n_neighbors=3)

# In-sample predictions on the same rows used for fitting
pred_reg = model_reg.predict(df.iloc[:, :3])
pred_reg[:4]
## array([0.26666667, 0.2 , 0.16666667, 0.2 ])
from sklearn.metrics import mean_squared_error

# Mean squared error of the in-sample predictions
mean_squared_error(y_true = df.iloc[:, 3],
                   y_pred = pred_reg)
## 0.01864444444444445
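
To make the regression output less of a black box, the sketch below checks that, with the default uniform weights, each prediction is simply the mean target value of the 3 nearest training rows, and then converts the MSE into an RMSE. It is an illustration under stated assumptions rather than the original analysis: `load_iris()` is used as a stand-in for `iris.csv`, so the numbers may not match the output above exactly.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import mean_squared_error
from sklearn.neighbors import KNeighborsRegressor

# Assumption: load_iris() stands in for iris.csv, which may differ slightly
data = load_iris().data
X, y = data[:, :3], data[:, 3]  # first three columns -> fourth column

model = KNeighborsRegressor(n_neighbors=3).fit(X, y)
pred = model.predict(X)

# Indices of the 3 nearest training rows for every query row
_, neighbor_idx = model.kneighbors(X)

# With uniform weights, the prediction is the mean of the neighbors' targets
manual_pred = y[neighbor_idx].mean(axis=1)
print(np.allclose(manual_pred, pred))

# RMSE puts the error back on the scale of the target
mse = mean_squared_error(y_true=y, y_pred=pred)
rmse = np.sqrt(mse)
print(mse, rmse)
```

For reference, the MSE of about 0.0186 reported above corresponds to an RMSE of roughly 0.137, in the same unit as the petal width column.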