<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Finite</title>
    <link>https://finite-elem.tistory.com/</link>
    <description>Learning and sharing my interests</description>
    <language>ko</language>
    <pubDate>Mon, 6 Jul 2026 06:56:40 +0900</pubDate>
    <generator>TISTORY</generator>
    <ttl>100</ttl>
    <managingEditor>elif</managingEditor>
    <image>
      <title>Finite</title>
      <url>https://tistory1.daumcdn.net/tistory/6758850/attach/22ec98092a604e339a3f1cdadb5d5cf5</url>
      <link>https://finite-elem.tistory.com</link>
    </image>
    <item>
      <title>173_Time Series Forecasting with LSTMs(2)</title>
      <link>https://finite-elem.tistory.com/175</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #222222; text-align: start;&quot;&gt;Continuing from the previous post (&lt;/span&gt;&lt;a href=&quot;https://finite-elem.tistory.com/174&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;172_Time Series Forecasting with LSTMs&lt;/a&gt;&lt;span style=&quot;color: #222222; text-align: start;&quot;&gt;).&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716444971163&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def train_model(
  model, 
  train_data, 
  train_labels, 
  test_data=None, 
  test_labels=None
):
  loss_fn = torch.nn.MSELoss(reduction='sum')

  optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
  num_epochs = 60

  train_hist = np.zeros(num_epochs)
  test_hist = np.zeros(num_epochs)

  for t in range(num_epochs):
    model.reset_hidden_state()

    # Use the function arguments rather than the global tensors
    y_pred = model(train_data)

    loss = loss_fn(y_pred.float(), train_labels)

    if test_data is not None:
      with torch.no_grad():
        y_test_pred = model(test_data)
        test_loss = loss_fn(y_test_pred.float(), test_labels)
      test_hist[t] = test_loss.item()

      if t % 10 == 0:  
        print(f'Epoch {t} train loss: {loss.item()} test loss: {test_loss.item()}')
    elif t % 10 == 0:
      print(f'Epoch {t} train loss: {loss.item()}')

    train_hist[t] = loss.item()
    
    optimiser.zero_grad()

    loss.backward()

    optimiser.step()
  
  return model.eval(), train_hist, test_hist&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The code above defines a function that trains the given model. The loss function is &lt;/span&gt;&lt;b&gt;MSELoss&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; with &lt;b&gt;reduction='sum'&lt;/b&gt;, the optimizer is &lt;b&gt;Adam&lt;/b&gt; with a learning rate of 1e-3, and the number of epochs is set to 60. In each epoch, the &lt;/span&gt;&lt;b&gt;for&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; loop resets the model's hidden state, runs a forward pass on the training data, and computes the loss between the predictions and the actual values. If test data is provided, the test loss is also computed (without gradient tracking) and stored, and both losses are printed every 10 epochs. Finally, the training loss is stored, the gradients are computed by backpropagation, and the optimizer updates the weights.&lt;/span&gt;&lt;/p&gt;
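&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of what a sum-reduced squared-error loss computes, the sketch below adds up the squared errors in plain Python and compares the result with the averaged form. The helper name &lt;b&gt;sum_squared_error&lt;/b&gt; and the sample values are assumptions made for this example; they are not part of the original code.&lt;/p&gt;

```python
# Illustrative sketch: sum-reduced squared error versus the mean-reduced form.
# `sum_squared_error` is a hypothetical helper, not a PyTorch function.

def sum_squared_error(predictions, targets):
    """Return the sum of squared differences between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


predictions = [0.5, 0.25, 1.0]
targets = [0.0, 0.25, 0.5]

total = sum_squared_error(predictions, targets)  # 0.25 + 0.0 + 0.25
mean = total / len(targets)                      # the averaged counterpart

print(total, mean)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Because a summed loss grows with the number of samples, values computed on sets of different sizes (for example, a training set and a smaller test set) are not directly comparable unless each is divided by its sample count.&lt;/p&gt;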
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716445990895&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;model = CoronaVirusPredictor(
  n_features=1,
  n_hidden=512,
  seq_len=seq_length,
  n_layers=2
)
model, train_hist, test_hist = train_model(
  model,
  X_train,
  y_train,
  X_test,
  y_test
)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;547&quot; data-origin-height=&quot;108&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/OMeWo/btsHAx1Twfi/RvpUnXqrkUTKPQHrhe1iVK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/OMeWo/btsHAx1Twfi/RvpUnXqrkUTKPQHrhe1iVK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/OMeWo/btsHAx1Twfi/RvpUnXqrkUTKPQHrhe1iVK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FOMeWo%2FbtsHAx1Twfi%2FRvpUnXqrkUTKPQHrhe1iVK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;547&quot; height=&quot;108&quot; data-origin-width=&quot;547&quot; data-origin-height=&quot;108&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;pre id=&quot;code_1716446028942&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;plt.plot(train_hist, label=&quot;Training loss&quot;)
plt.plot(test_hist, label=&quot;Test loss&quot;)
plt.ylim((0, 0.01))
plt.legend();&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1165&quot; data-origin-height=&quot;812&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bgsAgZ/btsHyUqIg1H/xqXYflVaFVYt5ydIeXtjb0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bgsAgZ/btsHyUqIg1H/xqXYflVaFVYt5ydIeXtjb0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bgsAgZ/btsHyUqIg1H/xqXYflVaFVYt5ydIeXtjb0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbgsAgZ%2FbtsHyUqIg1H%2FxqXYflVaFVYt5ydIeXtjb0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;640&quot; height=&quot;446&quot; data-origin-width=&quot;1165&quot; data-origin-height=&quot;812&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The results look a bit unusual. Normally the training loss is expected to be lower than the test loss, but here the test loss, small as it is, does not decrease and seems to oscillate. This is probably because the time series contains too few data points.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The current model predicts only one step ahead. To predict several future values, we can feed each predicted value back in as input for the next day's prediction and repeat the process.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716466351830&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;with torch.no_grad():
  test_seq = X_test[:1]
  preds = []
  for _ in range(len(X_test)):
    y_test_pred = model(test_seq)
    pred = torch.flatten(y_test_pred).item()
    preds.append(pred)
    new_seq = test_seq.numpy().flatten()
    new_seq = np.append(new_seq, [pred])
    new_seq = new_seq[1:]
    test_seq = torch.as_tensor(new_seq).view(1, seq_length, 1).float()


true_cases = scaler.inverse_transform(
    np.expand_dims(y_test.flatten().numpy(), axis=0)
).flatten()

predicted_cases = scaler.inverse_transform(
  np.expand_dims(preds, axis=0)
).flatten()&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The code above generates predictions over the test period with the trained model and converts the results back to the original scale. Starting from the first test sequence, it computes the model's prediction for the current sequence, appends the predicted value to the sequence, and removes the first value to keep the sequence length constant. The predictions and the true values are then inverse-transformed with the scaler so that they can be compared on the original scale. This shows how well the model predicts future values of the time series.&lt;/span&gt;&lt;/p&gt;
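&lt;p data-ke-size=&quot;size16&quot;&gt;The sliding-window rollout described here can be sketched without PyTorch. In the minimal example below, &lt;b&gt;naive_one_step&lt;/b&gt; is a hypothetical stand-in for the trained model (it simply returns the mean of the window), and &lt;b&gt;recursive_forecast&lt;/b&gt; is an assumed helper name; only the window-update logic mirrors the code above.&lt;/p&gt;

```python
# Illustrative sketch of recursive (autoregressive) multi-step forecasting.
# `naive_one_step` is a hypothetical stand-in for the trained model: it just
# predicts the mean of the current window.

def naive_one_step(window):
    return sum(window) / len(window)


def recursive_forecast(one_step, start_window, steps):
    """Roll a one-step predictor forward, feeding each prediction back in."""
    window = list(start_window)
    preds = []
    for _ in range(steps):
        pred = one_step(window)
        preds.append(pred)
        # Drop the oldest value and append the newest prediction,
        # so the window length stays constant.
        window = window[1:] + [pred]
    return preds


forecast = recursive_forecast(naive_one_step, [1.0, 2.0, 3.0], steps=3)
print(forecast)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;After the first step, every window contains at least one predicted value, so an error made early on is carried into all later predictions.&lt;/p&gt;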
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716466651354&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;plt.plot(
  daily_cases.index[:len(train_data)], 
  scaler.inverse_transform(train_data).flatten(),
  label='Historical Daily Cases'
)

plt.plot(
  daily_cases.index[len(train_data):len(train_data) + len(true_cases)], 
  true_cases,
  label='Real Daily Cases'
)

plt.plot(
  daily_cases.index[len(train_data):len(train_data) + len(true_cases)], 
  predicted_cases, 
  label='Predicted Daily Cases'
)

plt.legend();&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1699&quot; data-origin-height=&quot;1161&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bp1PPR/btsHA0iHZVH/WkAAJO8QYvaOQvZojN0njk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bp1PPR/btsHA0iHZVH/WkAAJO8QYvaOQvZojN0njk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bp1PPR/btsHA0iHZVH/WkAAJO8QYvaOQvZojN0njk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbp1PPR%2FbtsHA0iHZVH%2FWkAAJO8QYvaOQvZojN0njk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;543&quot; height=&quot;371&quot; data-origin-width=&quot;1699&quot; data-origin-height=&quot;1161&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The predictions are not good. This is likely because each prediction is fed back in as input, so an error in the first predicted value carries over into every later prediction, and the limited amount of data makes the problem worse.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716468903489&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;scaler = MinMaxScaler()

scaler = scaler.fit(np.expand_dims(daily_cases, axis=1))

all_data = scaler.transform(np.expand_dims(daily_cases, axis=1))

X_all, y_all = create_sequences(all_data, seq_length)

X_all = torch.from_numpy(X_all).float()
y_all = torch.from_numpy(y_all).float()

model = CoronaVirusPredictor(
  n_features=1, 
  n_hidden=512, 
  seq_len=seq_length, 
  n_layers=2
)
model, train_hist, _ = train_model(model, X_all, y_all)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716468968269&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;DAYS_TO_PREDICT = 12

with torch.no_grad():
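  # Seed the forecast with the most recent seq_length observations
  test_seq = torch.from_numpy(all_data[-seq_length:]).float().view(1, seq_length, 1)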
  preds = []
  for _ in range(DAYS_TO_PREDICT):
    y_test_pred = model(test_seq)
    pred = torch.flatten(y_test_pred).item()
    preds.append(pred)
    new_seq = test_seq.numpy().flatten()
    new_seq = np.append(new_seq, [pred])
    new_seq = new_seq[1:]
    test_seq = torch.as_tensor(new_seq).view(1, seq_length, 1).float()

predicted_cases = scaler.inverse_transform(np.expand_dims(preds, axis=0)).flatten()

predicted_index = pd.date_range(
  start=daily_cases.index[-1],
  periods=DAYS_TO_PREDICT + 1,
  inclusive='right'  # pandas 1.4 or later; older versions use closed='right'
)

predicted_cases = pd.Series(
  data=predicted_cases,
  index=predicted_index
)

plt.plot(daily_cases, label='Historical Daily Cases')
plt.plot(predicted_cases, label='Predicted Daily Cases')
plt.legend();&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1726&quot; data-origin-height=&quot;1161&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cvEZY0/btsHBegWNCW/hNXeOHteqrezOtsYM8zVNk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cvEZY0/btsHBegWNCW/hNXeOHteqrezOtsYM8zVNk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cvEZY0/btsHBegWNCW/hNXeOHteqrezOtsYM8zVNk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcvEZY0%2FbtsHBegWNCW%2FhNXeOHteqrezOtsYM8zVNk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;713&quot; height=&quot;480&quot; data-origin-width=&quot;1726&quot; data-origin-height=&quot;1161&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;We have explored an example of predicting future values from time series data. Forecasting time series is challenging, however, and the accuracy can be low. For detailed explanations and code, please refer to the reference below.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #666666; text-align: start;&quot;&gt;ref : Venelin Valkov, Get SH*T Done with PyTorch: Solve Real-world Machine Learning Problems with Deep Neural Networks in Python (2020)&lt;/span&gt;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/175</guid>
      <comments>https://finite-elem.tistory.com/175#entry175comment</comments>
      <pubDate>Wed, 22 May 2024 22:21:56 +0900</pubDate>
    </item>
    <item>
      <title>172_Time Series Forecasting with LSTMs</title>
      <link>https://finite-elem.tistory.com/174</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Time series data consists of data points recorded at consistent intervals. Long Short-Term Memory (LSTM) models, a type of recurrent neural network (RNN) that is effective at processing sequences, have become a widely adopted approach for such data. In this post, we will explore how to use an LSTM to predict future coronavirus cases from actual data. The example follows the explanations and code in the book listed in the reference below.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716439548501&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import torch

import os
import numpy as np
import pandas as pd
from tqdm import tqdm
import seaborn as sns
from pylab import rcParams
import matplotlib.pyplot as plt
from matplotlib import rc
from sklearn.preprocessing import MinMaxScaler
from pandas.plotting import register_matplotlib_converters
from torch import nn, optim&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The required libraries are listed above; since they were explained in previous posts, they are not explained individually here. The data comes from the Johns Hopkins University Center for Systems Science and Engineering and contains the daily reported cases by country. Here, we use only the time series of confirmed cases.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716439712679&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# !wget https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv
!gdown --id 1AsfdLrGESCQnRW5rbMz56A1KBc3Fe5aV
df = pd.read_csv('time_series_19-covid-Confirmed.csv')
df.head()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;834&quot; data-origin-height=&quot;204&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/VAZJI/btsHzHjDrAv/x2z0hen3RJ82OJBuLhbYIk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/VAZJI/btsHzHjDrAv/x2z0hen3RJ82OJBuLhbYIk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/VAZJI/btsHzHjDrAv/x2z0hen3RJ82OJBuLhbYIk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FVAZJI%2FbtsHzHjDrAv%2Fx2z0hen3RJ82OJBuLhbYIk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;834&quot; height=&quot;204&quot; data-origin-width=&quot;834&quot; data-origin-height=&quot;204&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The data includes information such as province/state, country, latitude, and longitude, which we do not need. In addition, the case counts are cumulative, so we will convert them to daily counts.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716440342041&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;df = df.iloc[:,4:]
df.head()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;832&quot; data-origin-height=&quot;188&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bfkb0M/btsHxLadRNK/UdPZkFPcoMsVYff2fmGO01/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bfkb0M/btsHxLadRNK/UdPZkFPcoMsVYff2fmGO01/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bfkb0M/btsHxLadRNK/UdPZkFPcoMsVYff2fmGO01/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbfkb0M%2FbtsHxLadRNK%2FUdPZkFPcoMsVYff2fmGO01%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;832&quot; height=&quot;188&quot; data-origin-width=&quot;832&quot; data-origin-height=&quot;188&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716440390791&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;daily_cases = df.sum(axis=0)
daily_cases.index = pd.to_datetime(daily_cases.index)
plt.plot(daily_cases)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2352&quot; data-origin-height=&quot;1643&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cl5OOV/btsHz5q26sP/j5xwJqMZk8oTOBFqXfKCkK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cl5OOV/btsHz5q26sP/j5xwJqMZk8oTOBFqXfKCkK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cl5OOV/btsHz5q26sP/j5xwJqMZk8oTOBFqXfKCkK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fcl5OOV%2FbtsHz5q26sP%2Fj5xwJqMZk8oTOBFqXfKCkK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;578&quot; height=&quot;404&quot; data-origin-width=&quot;2352&quot; data-origin-height=&quot;1643&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The plot shows that the case counts are cumulative.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716441293854&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;daily_cases = daily_cases.diff().fillna(daily_cases.iloc[0]).astype(np.int64)
plt.plot(daily_cases)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2352&quot; data-origin-height=&quot;1643&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/u4PTt/btsHAg6Zgwb/cDcQSoaRKtAkXKZnNidO7K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/u4PTt/btsHAg6Zgwb/cDcQSoaRKtAkXKZnNidO7K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/u4PTt/btsHAg6Zgwb/cDcQSoaRKtAkXKZnNidO7K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fu4PTt%2FbtsHAg6Zgwb%2FcDcQSoaRKtAkXKZnNidO7K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;588&quot; height=&quot;411&quot; data-origin-width=&quot;2352&quot; data-origin-height=&quot;1643&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Using &lt;b&gt;diff&lt;/b&gt;, we can easily convert the cumulative counts into daily counts. The first value has no predecessor, so it is filled with the original first entry.&lt;/p&gt;
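&lt;p data-ke-size=&quot;size16&quot;&gt;To make the differencing step concrete, here is a small sketch in plain Python that does the same thing by hand. The helper name &lt;b&gt;to_daily&lt;/b&gt; and the numbers are made up for illustration.&lt;/p&gt;

```python
# Illustrative sketch: turning cumulative totals into per-day counts.
# The numbers are made up for the example.

def to_daily(cumulative):
    """Return per-step increments; the first entry keeps the first total."""
    daily = [cumulative[0]]
    for previous, current in zip(cumulative, cumulative[1:]):
        daily.append(current - previous)
    return daily


cumulative_cases = [5, 8, 8, 15, 21]
daily = to_daily(cumulative_cases)
print(daily)  # [5, 3, 0, 7, 6]
```

&lt;p data-ke-size=&quot;size16&quot;&gt;A quick sanity check is that the daily counts add up to the last cumulative total.&lt;/p&gt;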
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716442373085&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;test_data_size = 14
train_data = daily_cases[:-test_data_size]
test_data = daily_cases[-test_data_size:]

scaler = MinMaxScaler()
scaler = scaler.fit(np.expand_dims(train_data,axis = 1))
train_data = scaler.transform(np.expand_dims(train_data, axis = 1))
test_data = scaler.transform(np.expand_dims(test_data, axis = 1))

def create_sequences(data, seq_length):
  xs = []
  ys = []
  for i in range(len(data) - seq_length - 1):
    x = data[i:(i+seq_length)]
    y = data[i+seq_length]
    xs.append(x)
    ys.append(y)
  return np.array(xs), np.array(ys)

seq_length = 5
X_train, y_train = create_sequences(train_data, seq_length)
X_test, y_test = create_sequences(test_data, seq_length)

X_train = torch.from_numpy(X_train).float()
y_train = torch.from_numpy(y_train).float()

X_test = torch.from_numpy(X_test).float()
y_test = torch.from_numpy(y_test).float()&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The code above splits the time series data into training and test sets, normalizes it with a scaler fitted on the training data only, and creates sequences for model training. Finally, it converts the data into PyTorch tensors for use as model input.&lt;/p&gt;
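&lt;p data-ke-size=&quot;size16&quot;&gt;The preparation steps can be sketched on a toy series in plain Python. All helper names (&lt;b&gt;fit_min_max&lt;/b&gt;, &lt;b&gt;scale&lt;/b&gt;, &lt;b&gt;make_windows&lt;/b&gt;) and values below are assumptions for illustration. Note that this sketch uses every available window, whereas the function in the post stops one window earlier.&lt;/p&gt;

```python
# Illustrative sketch: min-max scaling fitted on the training split only,
# followed by sliding-window sequence creation. All names and values here
# are assumptions for the example, not the post's actual data.

def fit_min_max(values):
    return min(values), max(values)


def scale(values, lo, hi):
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(data, seq_length):
    """Pair each window of `seq_length` values with the value that follows it."""
    xs, ys = [], []
    for i in range(len(data) - seq_length):
        xs.append(data[i:i + seq_length])
        ys.append(data[i + seq_length])
    return xs, ys


series = [10, 20, 30, 40, 50, 60, 70, 80]
train, test = series[:6], series[6:]

lo, hi = fit_min_max(train)          # statistics come from the training split
train_scaled = scale(train, lo, hi)
test_scaled = scale(test, lo, hi)    # may fall outside [0, 1]

xs, ys = make_windows(train_scaled, seq_length=3)
print(len(xs), xs[0], ys[0])
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Fitting the scaler on the training split alone keeps information about the test period out of training; the price is that scaled test values can exceed 1 when the series keeps growing.&lt;/p&gt;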
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716443380907&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;class CoronaVirusPredictor(nn.Module):

  def __init__(self, n_features, n_hidden, seq_len, n_layers=2):
    super(CoronaVirusPredictor, self).__init__()

    self.n_hidden = n_hidden
    self.seq_len = seq_len
    self.n_layers = n_layers

    self.lstm = nn.LSTM(
      input_size=n_features,
      hidden_size=n_hidden,
      num_layers=n_layers,
      dropout=0.5
    )

    self.linear = nn.Linear(in_features=n_hidden, out_features=1)

  def reset_hidden_state(self):
    self.hidden = (
        torch.zeros(self.n_layers, self.seq_len, self.n_hidden),
        torch.zeros(self.n_layers, self.seq_len, self.n_hidden)
    )

  def forward(self, sequences):
    lstm_out, self.hidden = self.lstm(
      sequences.view(len(sequences), self.seq_len, -1),
      self.hidden
    )
    last_time_step = \
      lstm_out.view(self.seq_len, len(sequences), self.n_hidden)[-1]
    y_pred = self.linear(last_time_step)
    return y_pred&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The code above defines an LSTM-based neural network model. Using &lt;/span&gt;&lt;b&gt;nn.LSTM&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;, the LSTM layers can be defined and initialized in a few lines. The &lt;/span&gt;&lt;b&gt;reset_hidden_state&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; method initializes the hidden state and cell state as tensors filled with zeros. The &lt;/span&gt;&lt;b&gt;forward&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; method passes the input through the LSTM layers to obtain the output and the updated hidden state. &lt;/span&gt;&lt;b&gt;last_time_step&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; holds the output from the last time step of the LSTM, which is passed through the final linear layer to compute the prediction. Continued in the next post.&lt;/span&gt;&lt;/p&gt;
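&lt;p data-ke-size=&quot;size16&quot;&gt;To give a feel for what happens to the hidden state and cell state at each time step, the sketch below implements one step of a single-unit LSTM cell in plain Python using the standard gate equations. The function &lt;b&gt;lstm_cell_step&lt;/b&gt; and the weight key names are hypothetical; &lt;b&gt;nn.LSTM&lt;/b&gt; stores and applies its parameters differently, so treat this only as an illustration.&lt;/p&gt;

```python
# Illustrative sketch: one step of a single-unit LSTM cell in plain Python.
# Weights are passed in a dict with hypothetical key names; nn.LSTM stores
# its parameters differently.
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(x, h_prev, c_prev, w):
    """Return (h, c) after one time step for scalar input and state."""
    i = sigmoid(w['wi_x'] * x + w['wi_h'] * h_prev + w['bi'])    # input gate
    f = sigmoid(w['wf_x'] * x + w['wf_h'] * h_prev + w['bf'])    # forget gate
    g = math.tanh(w['wg_x'] * x + w['wg_h'] * h_prev + w['bg'])  # candidate
    o = sigmoid(w['wo_x'] * x + w['wo_h'] * h_prev + w['bo'])    # output gate
    c = f * c_prev + i * g       # new cell state
    h = o * math.tanh(c)         # new hidden state
    return h, c


zero_weights = {key: 0.0 for key in (
    'wi_x', 'wi_h', 'bi', 'wf_x', 'wf_h', 'bf',
    'wg_x', 'wg_h', 'bg', 'wo_x', 'wo_h', 'bo',
)}

# Start from zero hidden and cell states.
h, c = lstm_cell_step(x=1.0, h_prev=0.0, c_prev=0.0, w=zero_weights)
print(h, c)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With all weights at zero, every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved at each step; trained weights let the gates decide how much past information to keep.&lt;/p&gt;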
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #666666; text-align: start;&quot;&gt;ref : Venelin Valkov, Get SH*T Done with PyTorch: Solve Real-world Machine Learning Problems with Deep Neural Networks in Python (2020)&lt;/span&gt;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/174</guid>
      <comments>https://finite-elem.tistory.com/174#entry174comment</comments>
      <pubDate>Tue, 21 May 2024 23:45:22 +0900</pubDate>
    </item>
    <item>
      <title>171_Image Classification using Torchvision(4)</title>
      <link>https://finite-elem.tistory.com/173</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #222222; text-align: start;&quot;&gt;Continuing from the previous post (&lt;/span&gt;&lt;a href=&quot;https://finite-elem.tistory.com/172&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;170_Image Classification using Torchvision(3)&lt;/a&gt;&lt;span style=&quot;color: #222222; text-align: start;&quot;&gt;).&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716382880082&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def show_confusion_matrix(confusion_matrix, class_names):
    cm = confusion_matrix.copy()

    cell_counts = cm.flatten()

    cm_row_norm = cm / cm.sum(axis=1)[:, np.newaxis]

    row_percentages = [&quot;{0:.2f}&quot;.format(value) for value in cm_row_norm.flatten()]

    cell_labels = [f&quot;{cnt}\n{per}&quot; for cnt, per in zip(cell_counts, row_percentages)]
    cell_labels = np.asarray(cell_labels).reshape(cm.shape[0], cm.shape[1])

    df_cm = pd.DataFrame(cm_row_norm, index=class_names, columns=class_names)

    hmap = sns.heatmap(df_cm, annot=cell_labels, fmt=&quot;&quot;, cmap=&quot;Blues&quot;)
    hmap.yaxis.set_ticklabels(hmap.yaxis.get_ticklabels(), rotation=0, ha='right')
    hmap.xaxis.set_ticklabels(hmap.xaxis.get_ticklabels(), rotation=30, ha='right')
    plt.ylabel('True Sign')
    plt.xlabel('Predicted Sign');


cm = confusion_matrix(y_test, y_pred)
show_confusion_matrix(cm, class_names)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2037&quot; data-origin-height=&quot;1490&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cAhPVI/btsHzaF4wrA/RK5eKV2jGrxx1njjLTuOW0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cAhPVI/btsHzaF4wrA/RK5eKV2jGrxx1njjLTuOW0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cAhPVI/btsHzaF4wrA/RK5eKV2jGrxx1njjLTuOW0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcAhPVI%2FbtsHzaF4wrA%2FRK5eKV2jGrxx1njjLTuOW0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;607&quot; height=&quot;444&quot; data-origin-width=&quot;2037&quot; data-origin-height=&quot;1490&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The above code defines a function to visualize the confusion matrix and uses it to evaluate the model's prediction performance. It flattens the confusion matrix into a 1D array to get the values of each cell, then normalizes each row by dividing by the sum of that row to convert the counts to proportions. It formats each value to two decimal places, converts it to a 1D array, combines the actual count and proportion for each cell to create labels, and reshapes this label array back to the original shape of the confusion matrix. The normalized confusion matrix is converted to a Pandas DataFrame with class names set for rows and columns, and a heatmap is generated using Seaborn.&lt;/span&gt;&lt;/p&gt;
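&lt;p data-ke-size=&quot;size16&quot;&gt;The normalization and labeling steps can be sketched in plain Python on a small made-up matrix. The helper names &lt;b&gt;row_normalize&lt;/b&gt; and &lt;b&gt;cell_labels&lt;/b&gt; are assumptions for this example and do not appear in the original code.&lt;/p&gt;

```python
# Illustrative sketch: row-normalising a confusion matrix and building the
# "count + proportion" cell labels. The matrix values are made up.

def row_normalize(matrix):
    """Divide each row by its sum so every row adds up to 1."""
    return [[cell / sum(row) for cell in row] for row in matrix]


def cell_labels(matrix):
    """Combine the raw count and the row proportion into one label per cell."""
    normalized = row_normalize(matrix)
    return [
        [f'{count}\n{share:.2f}' for count, share in zip(row, norm_row)]
        for row, norm_row in zip(matrix, normalized)
    ]


cm = [[8, 2],
      [1, 9]]

print(row_normalize(cm))   # [[0.8, 0.2], [0.1, 0.9]]
print(cell_labels(cm)[0])
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Each row corresponds to one true class, so the diagonal of the normalized matrix gives the per-class recall. A class with no samples would produce a division by zero here and would need separate handling.&lt;/p&gt;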
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Now, we'll see how to classify new data that is not included in the dataset. The example image to be used is as follows.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1272&quot; data-origin-height=&quot;1272&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b1vH8Y/btsHyod18AM/3CMmn5bQh96KiRXCZjGJyK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b1vH8Y/btsHyod18AM/3CMmn5bQh96KiRXCZjGJyK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b1vH8Y/btsHyod18AM/3CMmn5bQh96KiRXCZjGJyK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb1vH8Y%2FbtsHyod18AM%2F3CMmn5bQh96KiRXCZjGJyK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;339&quot; height=&quot;339&quot; data-origin-width=&quot;1272&quot; data-origin-height=&quot;1272&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716383780144&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def predict_proba(model, image_path):
  # Load the image and apply the same preprocessing used for the test set
  img = Image.open(image_path)
  img = img.convert('RGB')
  img = transforms['test'](img).unsqueeze(0)  # add a batch dimension

  # Inference only: evaluation mode, no gradient tracking
  model = model.eval()
  with torch.no_grad():
    pred = model(img.to(device))
    pred = F.softmax(pred, dim=1)  # logits to class probabilities
  return pred.cpu().numpy().flatten()


pred = predict_proba(base_model, 'stop-sign.jpg')
pred&lt;/code&gt;&lt;/pre&gt;
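&lt;p data-ke-size=&quot;size16&quot;&gt;The softmax step in &lt;b&gt;predict_proba&lt;/b&gt; turns the raw model outputs into probabilities that sum to 1. The following NumPy sketch shows the idea under stated assumptions: the function is a plain re-implementation for illustration, not the PyTorch one, and the four scores are hypothetical.&lt;/p&gt;

```python
import numpy as np

def softmax(logits):
    # Subtract the maximum for numerical stability; the result is unchanged
    shifted = np.asarray(logits, dtype=float) - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical raw scores for four classes, for illustration only
example_logits = [2.0, 0.5, -1.0, 0.1]
probs = softmax(example_logits)
print(probs, probs.sum())
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The class with the largest raw score also receives the largest probability, so softmax changes the scale of the outputs but not the predicted class.&lt;/p&gt;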
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;545&quot; data-origin-height=&quot;26&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/rBS9h/btsHyod2bTb/F9ynum5tKhJC0BsOjnFEBk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/rBS9h/btsHyod2bTb/F9ynum5tKhJC0BsOjnFEBk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/rBS9h/btsHyod2bTb/F9ynum5tKhJC0BsOjnFEBk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FrBS9h%2FbtsHyod2bTb%2FF9ynum5tKhJC0BsOjnFEBk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;545&quot; height=&quot;26&quot; data-origin-width=&quot;545&quot; data-origin-height=&quot;26&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Plotting the predicted probabilities as a bar chart makes the result easier to read.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716383931883&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def show_prediction_confidence(prediction, class_names):
    pred_df = pd.DataFrame({
        'class_names': class_names,
        'values': prediction
    })
    sns.barplot(x='values', y='class_names', data=pred_df, orient='h')
    plt.xlim([0, 1])

show_prediction_confidence(pred, class_names)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2202&quot; data-origin-height=&quot;1379&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/n67CM/btsHxI46C3L/ZSh2icPk5ojxo00tLbdCxk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/n67CM/btsHxI46C3L/ZSh2icPk5ojxo00tLbdCxk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/n67CM/btsHxI46C3L/ZSh2icPk5ojxo00tLbdCxk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fn67CM%2FbtsHxI46C3L%2FZSh2icPk5ojxo00tLbdCxk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;527&quot; height=&quot;330&quot; data-origin-width=&quot;2202&quot; data-origin-height=&quot;1379&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The model correctly classifies the image as a stop sign, so it performs well on at least this one example from outside the dataset. More examples would be needed to judge how well it generalizes, but this is sufficient for this post. Detailed information and code can be found in the reference below.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #666666; text-align: start;&quot;&gt;ref: Venelin Valkov, Get SH*T Done with PyTorch: Solve Real-World Machine Learning Problems with Deep Neural Networks in Python (2020)&lt;/span&gt;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/173</guid>
      <comments>https://finite-elem.tistory.com/173#entry173comment</comments>
      <pubDate>Mon, 20 May 2024 21:51:43 +0900</pubDate>
    </item>
    <item>
      <title>170_Image Classification using Torchvision(3)</title>
      <link>https://finite-elem.tistory.com/172</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #222222; text-align: start;&quot;&gt;Continuing from the previous post (&lt;/span&gt;&lt;a href=&quot;https://finite-elem.tistory.com/171&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;169_Image Classification using Torchvision(2)&lt;/a&gt;)&lt;span style=&quot;color: #222222; text-align: start;&quot;&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716303479058&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;%%time

base_model, history = train_model(base_model, data_loaders, dataset_sizes, device)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;469&quot; data-origin-height=&quot;342&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bKNkgj/btsHwggCiMz/8Uw5pQFUK8pPgeXGZsCtDk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bKNkgj/btsHwggCiMz/8Uw5pQFUK8pPgeXGZsCtDk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bKNkgj/btsHwggCiMz/8Uw5pQFUK8pPgeXGZsCtDk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbKNkgj%2FbtsHwggCiMz%2F8Uw5pQFUK8pPgeXGZsCtDk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;469&quot; height=&quot;342&quot; data-origin-width=&quot;469&quot; data-origin-height=&quot;342&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;To make the results easier to interpret, add a visualization function.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716306014414&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def plot_training_history(history):
  fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 6))

  ax1.plot(history['train_loss'], label='train loss')
  ax1.plot(history['val_loss'], label='validation loss')

  ax1.xaxis.set_major_locator(MaxNLocator(integer=True))
  ax1.set_ylim([-0.05, 1.05])
  ax1.legend()
  ax1.set_ylabel('Loss')
  ax1.set_xlabel('Epoch')

  ax2.plot(history['train_acc'], label='train accuracy')
  ax2.plot(history['val_acc'], label='validation accuracy')

  ax2.xaxis.set_major_locator(MaxNLocator(integer=True))
  ax2.set_ylim([-0.05, 1.05])
  ax2.legend()

  ax2.set_ylabel('Accuracy')
  ax2.set_xlabel('Epoch')

  fig.suptitle('Training history')
  
plot_training_history(history)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;This is the plotting code as written in the reference, but running it raised the following error.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;b&gt; &lt;span style=&quot;text-align: start;&quot;&gt;can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.&lt;/span&gt; &lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The error occurred because the &lt;b&gt;train_acc&lt;/b&gt; and &lt;b&gt;val_acc&lt;/b&gt; values stored in &lt;b&gt;history&lt;/b&gt; are tensors that live on the CUDA device, and Matplotlib cannot convert them to NumPy arrays directly. I therefore modified the code to copy each value to the CPU before plotting.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716306227579&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def to_numpy(tensor):
    if isinstance(tensor, torch.Tensor):
        return tensor.cpu().numpy()
    return tensor

def plot_training_history(history):
    train_loss = [to_numpy(x) for x in history['train_loss']]
    val_loss = [to_numpy(x) for x in history['val_loss']]
    train_acc = [to_numpy(x) for x in history['train_acc']]
    val_acc = [to_numpy(x) for x in history['val_acc']]
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 6))

    ax1.plot(train_loss, label='train loss')
    ax1.plot(val_loss, label='validation loss')

    ax1.xaxis.set_major_locator(MaxNLocator(integer=True))
    ax1.set_ylim([-0.05, 1.05])
    ax1.legend()
    ax1.set_ylabel('Loss')
    ax1.set_xlabel('Epoch')

    ax2.plot(train_acc, label='train accuracy')
    ax2.plot(val_acc, label='validation accuracy')

    ax2.xaxis.set_major_locator(MaxNLocator(integer=True))
    ax2.set_ylim([-0.05, 1.05])
    ax2.legend()

    ax2.set_ylabel('Accuracy')
    ax2.set_xlabel('Epoch')

    fig.suptitle('Training history')

plot_training_history(history)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2954&quot; data-origin-height=&quot;1191&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/B74VT/btsHym66CVq/TvcbijC98iWKIkeRPHbzs1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/B74VT/btsHym66CVq/TvcbijC98iWKIkeRPHbzs1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/B74VT/btsHym66CVq/TvcbijC98iWKIkeRPHbzs1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FB74VT%2FbtsHym66CVq%2FTvcbijC98iWKIkeRPHbzs1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;766&quot; height=&quot;309&quot; data-origin-width=&quot;2954&quot; data-origin-height=&quot;1191&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The pre-trained model showed high accuracy and low loss even after only 3 epochs.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716306980755&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def show_predictions(model, class_names, n_images=6):
  model = model.eval()
  images_handled = 0
  plt.figure()

  with torch.no_grad():
    for i, (inputs, labels) in enumerate(data_loaders['test']):
      inputs = inputs.to(device)
      labels = labels.to(device)

      outputs = model(inputs)
      # The index of the largest output is the predicted class
      _, preds = torch.max(outputs, 1)

      for j in range(inputs.shape[0]):
        images_handled += 1
        # Lay the images out on a grid with 2 rows
        ax = plt.subplot(2, n_images//2, images_handled)
        ax.set_title(f'predicted: {class_names[preds[j]]}')
        imshow(inputs.cpu().data[j])
        ax.axis('off')

        # Stop once the requested number of images has been drawn
        if images_handled == n_images:
          return

show_predictions(base_model, class_names, n_images=8)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1905&quot; data-origin-height=&quot;1161&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cBW9mT/btsHwL8iJYA/dwuYSEFFQ32oTbiok2EY60/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cBW9mT/btsHwL8iJYA/dwuYSEFFQ32oTbiok2EY60/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cBW9mT/btsHwL8iJYA/dwuYSEFFQ32oTbiok2EY60/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcBW9mT%2FbtsHwL8iJYA%2FdwuYSEFFQ32oTbiok2EY60%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;679&quot; height=&quot;414&quot; data-origin-width=&quot;1905&quot; data-origin-height=&quot;1161&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The above code defines a function that uses the given model to perform predictions on the test data and visualizes the predicted results. In the first &lt;/span&gt;for&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; loop, it iterates through the data in batches using the test data loader, selecting the class index with the highest value in the output as the predicted value. In the second &lt;/span&gt;for&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; loop, it iterates through each image within the batch and continues until the specified number of visualized images is reached.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The output shows that the model predicts correctly even when the sign is very blurry or partially obscured.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716307376040&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def get_predictions(model, data_loader):
  model = model.eval()
  predictions = []
  real_values = []
  with torch.no_grad():
    for inputs, labels in data_loader:
      inputs = inputs.to(device)
      labels = labels.to(device)

      outputs = model(inputs)
      _, preds = torch.max(outputs, 1)
      predictions.extend(preds)
      real_values.extend(labels)
  predictions = torch.as_tensor(predictions).cpu()
  real_values = torch.as_tensor(real_values).cpu()
  return predictions, real_values

y_pred, y_test = get_predictions(base_model, data_loaders['test'])
print(classification_report(y_test, y_pred, target_names=class_names))&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;428&quot; data-origin-height=&quot;192&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/eFcK6Z/btsHxFlXVoM/fCzB42uNrhsBJTLkqzwPak/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/eFcK6Z/btsHxFlXVoM/fCzB42uNrhsBJTLkqzwPak/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/eFcK6Z/btsHxFlXVoM/fCzB42uNrhsBJTLkqzwPak/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FeFcK6Z%2FbtsHxFlXVoM%2FfCzB42uNrhsBJTLkqzwPak%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;428&quot; height=&quot;192&quot; data-origin-width=&quot;428&quot; data-origin-height=&quot;192&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The code above defines a function that runs the given model over a data loader and returns the predicted and actual labels. It is then applied to the test data, and a classification report is printed to evaluate the model's performance. The report shows that the model is surprisingly accurate. &lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Continued in the next post.&lt;/span&gt; &lt;/span&gt;&lt;/p&gt;
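&lt;p data-ke-size=&quot;size16&quot;&gt;The per-class precision and recall that such a report lists can be computed by hand. The sketch below is an illustration under assumptions, not the scikit-learn implementation: the helper name &lt;b&gt;precision_recall&lt;/b&gt; and the label lists are hypothetical, and a class with no predictions or no samples is simply assigned 0.0.&lt;/p&gt;

```python
import numpy as np

def precision_recall(y_true, y_pred, cls):
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Samples that are both predicted as cls and actually labeled cls
    true_pos = np.sum((y_pred == cls) * (y_true == cls))
    predicted = np.sum(y_pred == cls)   # all samples predicted as cls
    actual = np.sum(y_true == cls)      # all samples whose label is cls
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return float(precision), float(recall)

# Hypothetical labels, for illustration only
y_true_demo = [0, 0, 1, 1, 1, 2]
y_pred_demo = [0, 1, 1, 1, 2, 2]
for cls in range(3):
    print(cls, precision_recall(y_true_demo, y_pred_demo, cls))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Precision answers how many of the samples predicted as a class really belong to it, while recall answers how many samples of that class the model found.&lt;/p&gt;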
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #666666; text-align: start;&quot;&gt;ref: Venelin Valkov, Get SH*T Done with PyTorch: Solve Real-World Machine Learning Problems with Deep Neural Networks in Python (2020)&lt;/span&gt;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/172</guid>
      <comments>https://finite-elem.tistory.com/172#entry172comment</comments>
      <pubDate>Sun, 19 May 2024 23:06:49 +0900</pubDate>
    </item>
    <item>
      <title>169_Image Classification using Torchvision(2)</title>
      <link>https://finite-elem.tistory.com/171</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Continuing from the previous post (&lt;a href=&quot;https://finite-elem.tistory.com/170&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;168_Image Classification using Torchvision&lt;/a&gt;).&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716219763514&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;image_datasets = {
  d: ImageFolder(f'{DATA_DIR}/{d}', transforms[d]) for d in DATASETS
}

data_loaders = {
  d: DataLoader(image_datasets[d], batch_size=4, shuffle=True, num_workers=4) 
  for d in DATASETS
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Here, to facilitate easier training, PyTorch datasets and data loaders are created for each image dataset folder. Data loaders provide data in batches. The &lt;/span&gt;&lt;b&gt;image_datasets&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; dictionary loads the image data for each dataset and stores an &lt;/span&gt;&lt;b&gt;ImageFolder&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; object with the corresponding transformations applied. The &lt;/span&gt;&lt;b&gt;data_loaders&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; dictionary stores a &lt;/span&gt;&lt;b&gt;DataLoader&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; object for each dataset, which loads data in batches. This setup efficiently provides data during model training and evaluation.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716219939751&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def imshow(inp, title=None):
  inp = inp.numpy().transpose((1, 2, 0))
  mean = np.array([mean_nums])
  std = np.array([std_nums])
  inp = std * inp + mean
  inp = np.clip(inp, 0, 1)
  plt.imshow(inp)
  if title is not None:
    plt.title(title)
  plt.axis('off')

inputs, classes = next(iter(data_loaders['train']))
out = torchvision.utils.make_grid(inputs)

imshow(out, title=[class_names[x] for x in classes])&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;br /&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The code above displays a batch of training images with the transformations applied. To display them correctly, the normalization must be reversed and the axes reordered so that the color channels come last. The &lt;/span&gt;&lt;b&gt;imshow&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function takes an image tensor (&lt;/span&gt;&lt;b&gt;inp&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;) and an optional title (&lt;/span&gt;&lt;b&gt;title&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;) as input. It converts the PyTorch tensor to a NumPy array and transposes it from (channels, height, width) to (height, width, channels). To de-normalize, it builds NumPy arrays from the mean and standard deviation values, multiplies the image by the standard deviation and adds the mean, and clips the result to the range [0, 1].&lt;/span&gt;&lt;/p&gt;
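&lt;p data-ke-size=&quot;size16&quot;&gt;The de-normalization can be checked with a small round trip in NumPy. This is a sketch under assumptions: &lt;b&gt;mean_nums&lt;/b&gt; and &lt;b&gt;std_nums&lt;/b&gt; are set here to the commonly used ImageNet statistics, which may differ from the values defined earlier in this series, and the helper name &lt;b&gt;denormalize&lt;/b&gt; and the random test image are hypothetical.&lt;/p&gt;

```python
import numpy as np

# Assumed ImageNet statistics; the post defines mean_nums and std_nums elsewhere
mean_nums = [0.485, 0.456, 0.406]
std_nums = [0.229, 0.224, 0.225]

def denormalize(chw_image):
    # (C, H, W) to (H, W, C), the layout plt.imshow expects
    hwc = np.asarray(chw_image).transpose((1, 2, 0))
    # Undo (x - mean) / std per channel; broadcasting runs over the last axis
    restored = np.array(std_nums) * hwc + np.array(mean_nums)
    return np.clip(restored, 0, 1)

# Round trip on a hypothetical 3x2x2 image with values in [0, 1)
original = np.random.default_rng(0).random((3, 2, 2))
mean = np.array(mean_nums)[:, None, None]
std = np.array(std_nums)[:, None, None]
normalized = (original - mean) / std
recovered = denormalize(normalized)
print(np.allclose(recovered, original.transpose((1, 2, 0))))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Normalizing with the channels first and de-normalizing with the channels last gives back the original pixel values, which is why the transpose has to happen before the arithmetic.&lt;/p&gt;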
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716220471687&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def create_model(n_classes):
  model = models.resnet34(pretrained=True)

  n_features = model.fc.in_features
  model.fc = nn.Linear(n_features, n_classes)

  return model.to(device)

base_model = create_model(len(class_names))&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Instead of building a model from scratch, we reuse the ResNet architecture together with weights pre-trained on the ImageNet dataset. The &lt;/span&gt;&lt;b&gt;create_model&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function takes the number of classes as input and loads a pre-trained ResNet-34 model. It reads the number of input features of the final fully connected layer via &lt;/span&gt;&lt;b&gt;model.fc.in_features&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; and replaces that layer with a new &lt;/span&gt;&lt;b&gt;nn.Linear&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; layer whose output size matches the number of classes. The model is then moved to the target device.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716220763856&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def train_epoch(
  model, 
  data_loader, 
  loss_fn, 
  optimizer, 
  device, 
  scheduler, 
  n_examples
):
  model = model.train()

  losses = []
  correct_predictions = 0
  
  for inputs, labels in data_loader:
    inputs = inputs.to(device)
    labels = labels.to(device)

    outputs = model(inputs)

    _, preds = torch.max(outputs, dim=1)
    loss = loss_fn(outputs, labels)

    correct_predictions += torch.sum(preds == labels)
    losses.append(loss.item())

    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

  scheduler.step()

  return correct_predictions.double() / n_examples, np.mean(losses)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;train_epoch&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; is a function that trains the given model for one epoch. It sets the model to training mode and initializes a list to store the loss values for each batch and a variable to count the number of correctly predicted samples. Then, it iterates over the data loader in batches. The input data is passed to the model to compute the output values, and the class index with the highest value is selected as the predicted value. The output values are compared with the actual labels to compute the loss, the number of correctly predicted samples is accumulated, and the current batch's loss value is added to the list. The loss is then backpropagated to compute the gradients, and the weights are updated using the optimization algorithm, followed by gradient reset. Finally, the learning rate scheduler is updated to adjust the learning rate, and the function returns the accuracy and average loss values.&lt;/span&gt;&lt;/p&gt;
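&lt;p data-ke-size=&quot;size16&quot;&gt;The bookkeeping that &lt;b&gt;train_epoch&lt;/b&gt; performs, counting correct predictions and collecting one loss value per batch, can be isolated in a small NumPy sketch. The helper name &lt;b&gt;summarize_epoch&lt;/b&gt;, the two batches, and their loss values are hypothetical and only illustrate how the two returned numbers are formed.&lt;/p&gt;

```python
import numpy as np

def summarize_epoch(batches, n_examples):
    # batches: iterable of (logits, labels, loss) tuples, one per mini-batch
    losses = []
    correct_predictions = 0
    for logits, labels, loss in batches:
        preds = np.argmax(logits, axis=1)  # class with the highest score
        correct_predictions += int(np.sum(preds == np.asarray(labels)))
        losses.append(loss)
    # Accuracy is per example; the loss is averaged over batches
    return correct_predictions / n_examples, float(np.mean(losses))

# Two hypothetical batches of two examples each
demo_batches = [
    (np.array([[2.0, 0.1], [0.3, 1.5]]), [0, 1], 0.40),
    (np.array([[0.2, 0.9], [1.1, 0.4]]), [0, 0], 0.80),
]
acc, mean_loss = summarize_epoch(demo_batches, n_examples=4)
print(acc, mean_loss)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Note that the accuracy divides by the number of examples, while the loss is a mean over batches, so a smaller final batch carries the same weight in the loss as a full one.&lt;/p&gt;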
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716221011787&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def eval_model(model, data_loader, loss_fn, device, n_examples):
  model = model.eval()

  losses = []
  correct_predictions = 0

  with torch.no_grad():
    for inputs, labels in data_loader:
      inputs = inputs.to(device)
      labels = labels.to(device)

      outputs = model(inputs)

      _, preds = torch.max(outputs, dim=1)

      loss = loss_fn(outputs, labels)

      correct_predictions += torch.sum(preds == labels)
      losses.append(loss.item())

  return correct_predictions.double() / n_examples, np.mean(losses)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The &lt;/span&gt;&lt;b&gt;eval_model&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function evaluates the given model. It switches the model to evaluation mode and iterates over the data loader in batches to perform model predictions, calculating loss and accuracy. &lt;/span&gt;&lt;b&gt;torch.no_grad()&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; is used to disable gradient calculation during evaluation, which reduces memory usage and speeds up the evaluation process.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716221167384&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def train_model(model, data_loaders, dataset_sizes, device, n_epochs=3):
  optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
  scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
  loss_fn = nn.CrossEntropyLoss().to(device)

  history = defaultdict(list)
  best_accuracy = 0

  for epoch in range(n_epochs):

    print(f'Epoch {epoch + 1}/{n_epochs}')
    print('-' * 10)

    train_acc, train_loss = train_epoch(
      model,
      data_loaders['train'],    
      loss_fn, 
      optimizer, 
      device, 
      scheduler, 
      dataset_sizes['train']
    )

    print(f'Train loss {train_loss} accuracy {train_acc}')

    val_acc, val_loss = eval_model(
      model,
      data_loaders['val'],
      loss_fn,
      device,
      dataset_sizes['val']
    )

    print(f'Val   loss {val_loss} accuracy {val_acc}')
    print()

    history['train_acc'].append(train_acc)
    history['train_loss'].append(train_loss)
    history['val_acc'].append(val_acc)
    history['val_loss'].append(val_loss)

    if val_acc &amp;gt; best_accuracy:
      torch.save(model.state_dict(), 'best_model_state.bin')
      best_accuracy = val_acc

  print(f'Best val accuracy: {best_accuracy}')
  
  model.load_state_dict(torch.load('best_model_state.bin'))

  return model, history&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The &lt;/span&gt;&lt;b&gt;train_model&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function trains the model for a specified number of epochs and evaluates training and validation performance after each one. It saves the weights of the best-performing epoch and finally returns the best model together with the training history. The optimizer is Stochastic Gradient Descent (&lt;b&gt;SGD&lt;/b&gt;) with momentum, a StepLR scheduler multiplies the learning rate by 0.1 every 7 epochs, and the loss function is &lt;/span&gt;&lt;b&gt;CrossEntropyLoss&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;. In the loop, the model is trained and evaluated for each epoch, and the results are printed. The accuracy and loss are appended to &lt;/span&gt;&lt;b&gt;history&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;, and whenever the current validation accuracy exceeds the best value so far, the model weights are saved to the file &lt;/span&gt;&lt;b&gt;best_model_state.bin&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;. Finally, the best validation accuracy is printed, and the saved state is loaded back into the model. &lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Continued in the next post.&lt;/span&gt; &lt;/span&gt;&lt;/p&gt;
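&lt;p data-ke-size=&quot;size16&quot;&gt;The effect of the StepLR scheduler configured in &lt;b&gt;train_model&lt;/b&gt; (step_size=7, gamma=0.1) can be written as a closed-form expression. The sketch below is a plain Python illustration, not the PyTorch class itself, and the helper name &lt;b&gt;step_lr&lt;/b&gt; is hypothetical; it assumes the scheduler is stepped once at the end of every epoch, as in &lt;b&gt;train_epoch&lt;/b&gt;.&lt;/p&gt;

```python
def step_lr(base_lr, epoch, step_size=7, gamma=0.1):
    # Learning rate in effect during the given 0-based epoch
    return base_lr * gamma ** (epoch // step_size)

for epoch in (0, 2, 6, 7, 14):
    print(epoch, step_lr(0.001, epoch))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With the default of 3 epochs the first decay at epoch 7 is never reached, so the learning rate stays at 0.001 for the whole run in this post.&lt;/p&gt;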
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #666666; text-align: start;&quot;&gt;ref : Venelin Valkov - Get SH_T Done with PyTorch_ Solve Real-world Machine Learning Problems with Deep Neural Networks in Python-Venelin Valkov (2020)&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/171</guid>
      <comments>https://finite-elem.tistory.com/171#entry171comment</comments>
      <pubDate>Sat, 18 May 2024 22:48:31 +0900</pubDate>
    </item>
    <item>
      <title>168_Image Classification using Torchvision</title>
      <link>https://finite-elem.tistory.com/170</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;I found an example of image classification using transfer learning, and I plan to follow the code to study it. The reference book is mentioned at the end of this post. The explanations are detailed, making it a good resource for studying. First, the required libraries are as follows.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716198021433&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import torch, torchvision

from pathlib import Path
import numpy as np
import cv2
import pandas as pd
from tqdm import tqdm
import PIL.Image as Image
import seaborn as sns
from pylab import rcParams
import matplotlib.pyplot as plt
from matplotlib import rc
from matplotlib.ticker import MaxNLocator
from torch.optim import lr_scheduler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
from glob import glob
import shutil
from collections import defaultdict

from torch import nn, optim

import torch.nn.functional as F
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
from torchvision import models&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Of these imports, &lt;b&gt;pathlib.Path&lt;/b&gt; handles file paths and file system operations, &lt;b&gt;cv2&lt;/b&gt; handles image and video processing, the &lt;b&gt;Image&lt;/b&gt; module of the Python Imaging Library (PIL) opens, converts, and saves images, &lt;b&gt;seaborn&lt;/b&gt; draws statistical plots, &lt;b&gt;pylab.rcParams&lt;/b&gt; adjusts the default settings and styles of graphs, &lt;b&gt;MaxNLocator&lt;/b&gt; from Matplotlib places ticks automatically for better readability, &lt;b&gt;lr_scheduler&lt;/b&gt; in PyTorch adjusts the learning rate during training, &lt;b&gt;glob&lt;/b&gt; finds file paths matching a pattern, &lt;b&gt;shutil&lt;/b&gt; copies and moves files and directories, &lt;b&gt;collections.defaultdict&lt;/b&gt; creates dictionaries that supply a default value when a key is missing, &lt;b&gt;torchvision.datasets.ImageFolder&lt;/b&gt; loads image datasets organized with one directory per class, and &lt;b&gt;torchvision.models&lt;/b&gt; provides deep learning model architectures, including pre-trained ones.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #0d0d0d;&quot;&gt;&lt;span style=&quot;background-color: #ffffff;&quot;&gt;The dataset consists of over 50,000 annotated images covering more than 40 traffic sign classes, and it can be downloaded from the benchmark site (&lt;a href=&quot;https://benchmark.ini.rub.de/?section=gtsrb&amp;amp;subsection=dataset&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://benchmark.ini.rub.de/?section=gtsrb&amp;amp;subsection=dataset&lt;/a&gt;).&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716204792027&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;train_folders = sorted(glob('GTSRB/Final_Training/Images/*'))&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Use &lt;b&gt;sorted&lt;/b&gt; and &lt;b&gt;glob&lt;/b&gt; to collect the paths of the class folders in the downloaded training set, one folder per traffic sign class, in sorted order.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716204921603&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def load_image(img_path, resize=True):
  img = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)

  if resize:
    img = cv2.resize(img, (64, 64), interpolation = cv2.INTER_AREA)

  return img

def show_image(img_path):
  img = load_image(img_path)
  plt.imshow(img)
  plt.axis('off')

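def show_sign_grid(image_paths):
  images = [load_image(img) for img in image_paths]
  # Stack into one (N, H, W, C) array first; converting a list of arrays directly is slow
  images = torch.as_tensor(np.array(images))
  images = images.permute(0, 3, 1, 2)  # to (N, C, H, W) for make_grid
  grid_img = torchvision.utils.make_grid(images, nrow=11)
  plt.figure(figsize=(24, 12))
  plt.imshow(grid_img.permute(1, 2, 0))  # back to (H, W, C) for matplotlib
  plt.axis('off');&lt;/code&gt;&lt;/pre&gt;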
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The &lt;b&gt;load_image&lt;/b&gt; function takes the path of an image as input and returns the loaded image as an RGB NumPy array. The &lt;b&gt;resize&lt;/b&gt; parameter determines whether to resize the image after loading it. &lt;b&gt;cv2.imread&lt;/b&gt; reads the image from the specified path, and &lt;b&gt;cv2.cvtColor&lt;/b&gt; converts the image from BGR format to RGB format. If resize is True, the image is resized to 64x64.&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The &lt;b&gt;show_image&lt;/b&gt; function simply loads and displays an image.&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The &lt;b&gt;show_sign_grid&lt;/b&gt; function uses a list comprehension to load each image file from the image paths into a list, converts this list to a tensor, and reorders its dimensions with &lt;b&gt;permute&lt;/b&gt; from (batch size, height, width, channel) to (batch size, channel, height, width). Then, it arranges the images into a grid with 11 images per row using &lt;b&gt;make_grid&lt;/b&gt; and displays the grid.&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The images can be visualized as follows.&lt;/p&gt;
&lt;pre id=&quot;code_1716205280946&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;sample_images = [np.random.choice(glob(f'{tf}/*.ppm')) for tf in train_folders]
show_sign_grid(sample_images)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;3760&quot; data-origin-height=&quot;1399&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/RWbVM/btsHwtZVtxQ/JD5TstGMKiCGtCtdCQhMdK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/RWbVM/btsHwtZVtxQ/JD5TstGMKiCGtCtdCQhMdK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/RWbVM/btsHwtZVtxQ/JD5TstGMKiCGtCtdCQhMdK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FRWbVM%2FbtsHwtZVtxQ%2FJD5TstGMKiCGtCtdCQhMdK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;634&quot; height=&quot;236&quot; data-origin-width=&quot;3760&quot; data-origin-height=&quot;1399&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716206011450&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;class_names = ['priority_road', 'give_way', 'stop', 'no_entry']
class_indices = [12, 13, 14, 17]

!rm -rf data

DATA_DIR = Path('data')

DATASETS = ['train', 'val', 'test']

for ds in DATASETS:
  for cls in class_names:
    (DATA_DIR / ds / cls).mkdir(parents=True, exist_ok=True)
    
for i, cls_index in enumerate(class_indices):
  image_paths = np.array(glob(f'{train_folders[cls_index]}/*.ppm'))
  class_name = class_names[i]
  print(f'{class_name}: {len(image_paths)}')
  np.random.shuffle(image_paths)

  ds_split = np.split(
    image_paths, 
    indices_or_sections=[int(.8*len(image_paths)), int(.9*len(image_paths))]
  )

  dataset_data = zip(DATASETS, ds_split)

  for ds, images in dataset_data:
    for img_path in images:
      shutil.copy(img_path, f'{DATA_DIR}/{ds}/{class_name}/')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;161&quot; data-origin-height=&quot;82&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/GI4So/btsHuyn6tyl/0P8OSH7INs8Mbs4taRFuSK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/GI4So/btsHuyn6tyl/0P8OSH7INs8Mbs4taRFuSK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/GI4So/btsHuyn6tyl/0P8OSH7INs8Mbs4taRFuSK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FGI4So%2FbtsHuyn6tyl%2F0P8OSH7INs8Mbs4taRFuSK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;161&quot; height=&quot;82&quot; data-origin-width=&quot;161&quot; data-origin-height=&quot;82&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;class_names&lt;/b&gt; and &lt;b&gt;class_indices&lt;/b&gt; list the four sign classes used here and the indices of their folders in the dataset. The &lt;b&gt;!rm -rf data&lt;/b&gt; line is a notebook shell command that removes any previous data directory. &lt;b&gt;DATA_DIR&lt;/b&gt; sets the base directory for saving the data, and &lt;b&gt;DATASETS&lt;/b&gt; names the three splits: training, validation, and test. The first for loop then creates a directory for each split and each class, so that every split ends up with one folder per class.&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;In the final for loop, iterate over each class index to get the paths of .ppm image files in the specified directory, and retrieve the current class name from &lt;b&gt;class_names&lt;/b&gt;. Split the image file paths into 80% for training, 10% for validation, and 10% for testing. Use &lt;b&gt;zip&lt;/b&gt; to match dataset types with the split image file paths, and iterate through each dataset type and corresponding image file paths, copying them to the appropriate dataset directory. This code prepares the dataset for model training and evaluation.&lt;/p&gt;
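&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The 80/10/10 split comes from the two cut points passed to &lt;b&gt;np.split&lt;/b&gt;. As a rough plain-Python sketch of that arithmetic (the file names are made up for illustration, and list slicing stands in for &lt;b&gt;np.split&lt;/b&gt;), the cut points work like this.&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# Hypothetical file names standing in for the shuffled image paths.
image_paths = [f'img_{i:03d}.ppm' for i in range(25)]

n = len(image_paths)
train_end = int(.8 * n)  # first cut point: 80% of the data
val_end = int(.9 * n)    # second cut point: 90% of the data

# Slicing at the two cut points mirrors np.split(image_paths, [train_end, val_end]).
splits = {
    'train': image_paths[:train_end],
    'val': image_paths[train_end:val_end],
    'test': image_paths[val_end:],
}

for name, paths in splits.items():
    print(f'{name}: {len(paths)}')&lt;/code&gt;&lt;/pre&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;Because int() rounds down, the validation and test sets are not always exactly the same size when the number of images is not a multiple of 10.&lt;/p&gt;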
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716219344433&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;mean_nums = [0.485, 0.456, 0.406]
std_nums = [0.229, 0.224, 0.225]

transforms = {'train': T.Compose([
  T.RandomResizedCrop(size=256),
  T.RandomRotation(degrees=15),
  T.RandomHorizontalFlip(),
  T.ToTensor(),
  T.Normalize(mean_nums, std_nums)
]), 'val': T.Compose([
  T.Resize(size=256),
  T.CenterCrop(size=224),
  T.ToTensor(),
  T.Normalize(mean_nums, std_nums)
]), 'test': T.Compose([
  T.Resize(size=256),
  T.CenterCrop(size=224),
  T.ToTensor(),
  T.Normalize(mean_nums, std_nums)
]),
}&lt;/code&gt;&lt;/pre&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The code uses &lt;b&gt;torchvision.transforms&lt;/b&gt; for image preprocessing and data augmentation. First, &lt;b&gt;mean_nums&lt;/b&gt; and &lt;b&gt;std_nums&lt;/b&gt; represent the mean and standard deviation values for each image channel to perform normalization. The &lt;b&gt;transforms&lt;/b&gt; dictionary defines the transformations to be applied to each dataset. The purpose of &lt;b&gt;Compose&lt;/b&gt; is to apply multiple transformations sequentially.&lt;/p&gt;
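&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;To make sequential application and per-channel normalization concrete, here is a minimal plain-Python sketch. It is not how torchvision is implemented; the helper names &lt;b&gt;compose&lt;/b&gt;, &lt;b&gt;to_unit_range&lt;/b&gt;, and &lt;b&gt;normalize&lt;/b&gt; are made up for illustration, and a single RGB pixel stands in for a whole image.&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def compose(steps):
    # Return a function that applies each step in order, like T.Compose.
    def apply(x):
        for step in steps:
            x = step(x)
        return x
    return apply

def to_unit_range(pixel):
    # Scale 0..255 channel values to 0..1, as T.ToTensor does for uint8 images.
    return [v / 255 for v in pixel]

def normalize(mean, std):
    # Per channel: (value - mean) / std, the formula T.Normalize applies.
    def apply(pixel):
        return [(v - m) / s for v, m, s in zip(pixel, mean, std)]
    return apply

mean_nums = [0.485, 0.456, 0.406]
std_nums = [0.229, 0.224, 0.225]

pipeline = compose([to_unit_range, normalize(mean_nums, std_nums)])
result = pipeline([255, 128, 0])  # one hypothetical RGB pixel
print(result)&lt;/code&gt;&lt;/pre&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The order of the steps matters: the mean and standard deviation are given on the 0 to 1 scale, so the scaling step has to run before normalization.&lt;/p&gt;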
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;RandomResizedCrop&lt;/b&gt; randomly crops the image and resizes it to 256x256, &lt;b&gt;RandomRotation&lt;/b&gt; randomly rotates the image between -15 and 15 degrees, and &lt;b&gt;RandomHorizontalFlip&lt;/b&gt; randomly flips the image horizontally. After augmentation, the image is converted to a PyTorch tensor and normalized using the given mean and standard deviation for each channel. &lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Continued in the next post..&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #666666; letter-spacing: 0px;&quot;&gt;ref : Venelin Valkov - Get SH_T Done with PyTorch_ Solve Real-world Machine Learning Problems with Deep Neural Networks in Python-Venelin Valkov (2020)&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/170</guid>
      <comments>https://finite-elem.tistory.com/170#entry170comment</comments>
      <pubDate>Fri, 17 May 2024 23:53:45 +0900</pubDate>
    </item>
    <item>
      <title>167_Simple Regression Model using PyTorch(2)</title>
      <link>https://finite-elem.tistory.com/169</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Continuing from the last posting, I will write the code and provide explanations.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716133217968&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def calculate_accuracy(y_true, y_pred):
    predicted = y_pred.ge(.5).view(-1)
    return (y_true == predicted).sum().float() / len(y_true)

def round_tensor(t, decimal_places = 3):
    return round(t.item(), decimal_places)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The &lt;/span&gt;&lt;b&gt;calculate_accuracy&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function calculates and returns the model's prediction accuracy by determining if each element in &lt;/span&gt;&lt;b&gt;y_pred&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; is greater than or equal to 0.5 and converting these elements to True or False. The &lt;/span&gt;&lt;b&gt;ge&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function checks if each element is greater than or equal to 0.5. The &lt;/span&gt;&lt;b&gt;view(-1)&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function changes the tensor to a 1-dimensional tensor. The number of True values is then counted, converted to a float, and divided by the total number of values to calculate accuracy. The &lt;/span&gt;&lt;b&gt;round_tensor&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function takes a tensor &lt;/span&gt;&lt;b&gt;t&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; to be rounded and sets the number of decimal places according to &lt;/span&gt;&lt;b&gt;decimal_places&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;. It uses &lt;/span&gt;&lt;b&gt;t.item()&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; to convert the tensor &lt;/span&gt;&lt;b&gt;t&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;'s value to a Python number and rounds it. 
These two functions calculate the model's prediction accuracy and round the calculated value to the specified number of decimal places to present the results concisely.&lt;/span&gt;&lt;/p&gt;
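&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The same thresholding logic can be sketched without tensors. This is a plain-Python illustration, not the PyTorch code itself; the function name &lt;/span&gt;&lt;b&gt;accuracy_from_probs&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; and the sample values are made up, and &lt;/span&gt;&lt;b&gt;operator.ge&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; plays the role of the tensor method &lt;/span&gt;&lt;b&gt;ge&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;from operator import ge

def accuracy_from_probs(y_true, y_prob, threshold=0.5):
    # ge(p, threshold) is True when p is at or above the threshold, like y_pred.ge(.5).
    predicted = [int(ge(p, threshold)) for p in y_prob]
    # Count the positions where the prediction matches the label.
    correct = sum(int(t == p) for t, p in zip(y_true, predicted))
    return correct / len(y_true)

y_true = [0, 1, 1, 0]          # hypothetical labels
y_prob = [0.2, 0.7, 0.4, 0.5]  # hypothetical model outputs
print(round(accuracy_from_probs(y_true, y_prob), 3))&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Note that an output of exactly 0.5 counts as the positive class, because the comparison is greater than or equal to.&lt;/span&gt;&lt;/p&gt;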
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716134077558&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;for epoch in range(1000):
    # Forward pass on the training data
    y_pred = net(X_train)
    y_pred = torch.squeeze(y_pred)
    train_loss = criterion(y_pred, y_train)
    
    # Report train/test metrics every 100 epochs
    if epoch % 100 == 0:
        train_acc = calculate_accuracy(y_train, y_pred)
        y_test_pred = net(X_test)
        y_test_pred = torch.squeeze(y_test_pred)
        test_loss = criterion(y_test_pred, y_test)
        test_acc = calculate_accuracy(y_test, y_test_pred)
        print(f'''epoch {epoch} Train set - loss: {round_tensor(train_loss)}, accuracy: {round_tensor(train_acc)}
        Test  set - loss: {round_tensor(test_loss)}, accuracy: {round_tensor(test_acc)}
        ''')
    
    # Backpropagate and update the weights on every epoch
    optimizer.zero_grad()
    train_loss.backward()
    optimizer.step()&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;In the training loop for training the neural network, training is conducted for 1000 epochs, using &lt;/span&gt;&lt;b&gt;X_train&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; as input to the network to calculate predictions. The &lt;/span&gt;&lt;b&gt;squeeze&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function is used here to convert the tensor to a 1-dimensional tensor. The &lt;/span&gt;&lt;b&gt;criterion&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; compares the predicted values &lt;/span&gt;&lt;b&gt;y_pred&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; and the actual values &lt;/span&gt;&lt;b&gt;y_train&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; to calculate the loss. Every 100 epochs, the loss and accuracy on both the training and test sets are calculated and printed. &lt;/span&gt;&lt;b&gt;train_acc&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; calculates the accuracy for the training data, while &lt;/span&gt;&lt;b&gt;y_test_pred&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; uses &lt;/span&gt;&lt;b&gt;X_test&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; as input to the model to get the predicted values for the test data. &lt;/span&gt;&lt;b&gt;test_acc&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; calculates the accuracy for the test data. 
&lt;/span&gt;&lt;b&gt;optimizer.zero_grad()&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; resets the gradients from the previous step, &lt;/span&gt;&lt;b&gt;train_loss.backward()&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; calculates the gradients of the loss function through backpropagation, and &lt;/span&gt;&lt;b&gt;optimizer.step()&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; updates the model parameters using the calculated gradients.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;This process allows the model to be trained iteratively while monitoring its performance.&lt;/span&gt;&lt;/p&gt;
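&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The three optimizer calls can be pictured with a toy example. The following is only a plain-Python sketch with a single parameter and a hand-written gradient, not PyTorch code; the loss (w - 3) ** 2 and the learning rate are assumptions chosen for illustration.&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# Fit w so that the loss (w - 3) ** 2 is as small as possible.
w = 0.0
lr = 0.1

for epoch in range(100):
    grad = 0.0           # like optimizer.zero_grad(): start from a clean gradient
    loss = (w - 3) ** 2  # forward pass: compute the loss
    grad += 2 * (w - 3)  # like train_loss.backward(): derivative of the loss with respect to w
    w -= lr * grad       # like optimizer.step(): move against the gradient

print(round(w, 3))&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;In this sketch the update runs on every pass through the loop, and each pass moves w closer to the value that minimizes the loss.&lt;/span&gt;&lt;/p&gt;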
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;398&quot; data-origin-height=&quot;556&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/c4gt9E/btsHtNk5oGy/V3Bg4qJeU5gVlbPlwO5fL0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/c4gt9E/btsHtNk5oGy/V3Bg4qJeU5gVlbPlwO5fL0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/c4gt9E/btsHtNk5oGy/V3Bg4qJeU5gVlbPlwO5fL0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fc4gt9E%2FbtsHtNk5oGy%2FV3Bg4qJeU5gVlbPlwO5fL0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;398&quot; height=&quot;556&quot; data-origin-width=&quot;398&quot; data-origin-height=&quot;556&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716134725935&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;classes = ['No rain', 'Raining']
y_pred = net(X_test)
y_pred = y_pred.ge(.5).view(-1).cpu()
y_test = y_test.cpu()
print(classification_report(y_test,y_pred,target_names=classes))&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Set the names of the classified classes, here defined as 'No rain' and 'Raining'. Then, move the actual values and predicted values to the CPU and use &lt;/span&gt;&lt;b&gt;classification_report&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; to print the classification report.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;413&quot; data-origin-height=&quot;152&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bcKqvx/btsHuKt7gkp/et9u9qDFy7Hkd2Kb1rkt00/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bcKqvx/btsHuKt7gkp/et9u9qDFy7Hkd2Kb1rkt00/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bcKqvx/btsHuKt7gkp/et9u9qDFy7Hkd2Kb1rkt00/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbcKqvx%2FbtsHuKt7gkp%2Fet9u9qDFy7Hkd2Kb1rkt00%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;413&quot; height=&quot;152&quot; data-origin-width=&quot;413&quot; data-origin-height=&quot;152&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;In the case of 'No rain', the model works well, while for the 'Raining' class, the model does not perform well and produces unreliable predictions.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716135110133&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;cm = confusion_matrix(y_test, y_pred)
df_cm = pd.DataFrame(cm, index=classes, columns=classes)
hmap = sns.heatmap(df_cm, annot=True, fmt = 'd')
# Keep the y tick labels horizontal and tilt the x tick labels
hmap.yaxis.set_ticklabels(hmap.yaxis.get_ticklabels(), rotation=0, ha = 'right')
hmap.xaxis.set_ticklabels(hmap.xaxis.get_ticklabels(), rotation=30, ha = 'right')
plt.ylabel('True label')
plt.xlabel('Predicted label')&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Use the &lt;/span&gt;&lt;b&gt;confusion_matrix&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function to calculate the matrix comparing actual values and predicted values. Then, convert it into a DataFrame for easier visualization and use the &lt;/span&gt;&lt;b&gt;heatmap&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function to visualize the confusion matrix as a heatmap. This allows for an intuitive understanding of the model's performance.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2009&quot; data-origin-height=&quot;1379&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dRs4hj/btsHtwjy2XP/GKio003ky7AbP94Krtrmk0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dRs4hj/btsHtwjy2XP/GKio003ky7AbP94Krtrmk0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dRs4hj/btsHtwjy2XP/GKio003ky7AbP94Krtrmk0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdRs4hj%2FbtsHtwjy2XP%2FGKio003ky7AbP94Krtrmk0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;570&quot; height=&quot;391&quot; data-origin-width=&quot;2009&quot; data-origin-height=&quot;1379&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
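&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;What the confusion matrix counts can be shown by hand. This is a small plain-Python sketch for two classes, not the scikit-learn implementation; the function name &lt;/span&gt;&lt;b&gt;confusion_counts&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; and the sample labels are made up. As in &lt;/span&gt;&lt;b&gt;confusion_matrix&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;, rows are true labels and columns are predicted labels.&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;from collections import Counter

def confusion_counts(y_true, y_pred):
    # Count (true label, predicted label) pairs; rows are true, columns are predicted.
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in (0, 1)] for t in (0, 1)]

y_true = [0, 0, 0, 1, 1, 0]  # hypothetical labels: 0 = 'No rain', 1 = 'Raining'
y_pred = [0, 0, 1, 0, 1, 0]  # hypothetical predictions

cm = confusion_counts(y_true, y_pred)
recall_raining = cm[1][1] / sum(cm[1])  # share of rainy days the model caught
print(cm, recall_raining)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The second row is the one to watch for the 'Raining' class: its recall is the bottom-right count divided by the sum of that row.&lt;/span&gt;&lt;/p&gt;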
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #666666;&quot;&gt;ref : Venelin Valkov - Get SH_T Done with PyTorch_ Solve Real-world Machine Learning Problems with Deep Neural Networks in Python-Venelin Valkov (2020)&lt;/span&gt;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/169</guid>
      <comments>https://finite-elem.tistory.com/169#entry169comment</comments>
      <pubDate>Thu, 16 May 2024 22:42:17 +0900</pubDate>
    </item>
    <item>
      <title>166_Simple Regression Model using PyTorch</title>
      <link>https://finite-elem.tistory.com/168</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Today, I am working through a book whose detailed code makes it easy to study by following along, so I plan to follow and analyze the code. So far, I have mostly dealt with classification problems, but now I am going to implement a simple regression model using PyTorch. The required libraries are as follows.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716130002190&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import torch
import numpy as np
import pandas as pd
import seaborn as sns
from pylab import rcParams
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
from torch import nn, optim
import torch.nn.functional as F&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Seaborn&lt;/b&gt; is a Python library convenient for visualizing statistical data, &lt;b&gt;pylab&lt;/b&gt; is used alongside Matplotlib to easily adjust the style and settings of graphs, and the &lt;b&gt;metrics&lt;/b&gt; module from sklearn provides various indicators for evaluating model performance. Here, I'll use it to generate a &lt;b&gt;confusion matrix&lt;/b&gt; and a &lt;b&gt;classification report&lt;/b&gt; to evaluate the model's performance. The required data is the &quot;Rain in Australia&quot; dataset, and the .csv file can be downloaded from Kaggle (&lt;a href=&quot;https://www.kaggle.com/datasets/jsphyg/weather-dataset-rattle-package&quot; target=&quot;_blank&quot; rel=&quot;noopener&amp;nbsp;noreferrer&quot;&gt;https://www.kaggle.com/datasets/jsphyg/weather-dataset-rattle-package&lt;/a&gt;).&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716131852389&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;df = pd.read_csv('weatherAUS.csv')
cols = ['Rainfall', 'Humidity3pm', 'Pressure9am', 'RainToday', 'RainTomorrow']
df = df[cols].copy()  # copy so that the assignments below modify an independent DataFrame
df['RainToday'] = df['RainToday'].map({'No': 0, 'Yes': 1})
df['RainTomorrow'] = df['RainTomorrow'].map({'No': 0, 'Yes': 1})
df = df.dropna(how='any')&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Load the .csv file into a DataFrame. Then, store the necessary column names in a list and select only those columns, taking a copy so that later assignments act on an independent DataFrame. Use the &lt;/span&gt;&lt;b&gt;map&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function to convert the values in the 'RainToday' and 'RainTomorrow' columns from 'No' to 0 and 'Yes' to 1, and assign the result back to each column. Calling replace with the &lt;/span&gt;&lt;b&gt;inplace&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; option on a selected column is chained assignment, which recent pandas versions warn about and which may leave the DataFrame unchanged. Use the &lt;/span&gt;&lt;b&gt;dropna&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function to remove all rows with missing values.&lt;/span&gt;&lt;/p&gt;
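&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;As a small illustration of these cleaning steps, the sketch below applies them to a hand-made DataFrame. The three rows and the name toy are invented for the example and are not taken from weatherAUS.csv. It uses map and assigns the result back to each column.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import numpy as np
import pandas as pd

# Toy data, made up for illustration (not from weatherAUS.csv)
toy = pd.DataFrame({
    'Rainfall': [0.0, 2.4, np.nan],
    'RainToday': ['No', 'Yes', 'No'],
    'RainTomorrow': ['Yes', 'No', 'No'],
})

# Convert the Yes/No labels to 1/0 and assign the result back to each column
for col in ['RainToday', 'RainTomorrow']:
    toy[col] = toy[col].map({'No': 0, 'Yes': 1})

# Drop every row that contains at least one missing value
toy = toy.dropna(how='any')

print(toy.shape)                  # (2, 3): the row with the missing Rainfall is gone
print(toy['RainToday'].tolist())  # [0, 1]&lt;/code&gt;&lt;/pre&gt;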
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716132339496&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;RANDOM_SEED = 42  # assumed value (not given in this post); any fixed integer makes the split reproducible
X = df[['Rainfall','Humidity3pm','RainToday','Pressure9am']]
y = df[['RainTomorrow']]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=RANDOM_SEED)
X_train = torch.from_numpy(X_train.to_numpy()).float()
y_train = torch.squeeze(torch.from_numpy(y_train.to_numpy()).float())
X_test = torch.from_numpy(X_test.to_numpy()).float()
y_test = torch.squeeze(torch.from_numpy(y_test.to_numpy()).float())&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Next, store the independent variables from the DataFrame in '&lt;/span&gt;X'&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; and the dependent variable in '&lt;/span&gt;y'&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;. Use &lt;/span&gt;&lt;b&gt;train_test_split&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; to divide the data into training and testing sets, with 20% of the total data used as the test set. Also, fix the random seed value to make the results reproducible. Then, convert the data to PyTorch tensors by first converting the Pandas DataFrame to a numpy array and then converting it to a float-type PyTorch tensor. The purpose of using &lt;/span&gt;squeeze&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; is to reduce the dimensions of the tensor, changing it from a 2-dimensional tensor to a 1-dimensional tensor.&lt;/span&gt;&lt;/p&gt;
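&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The effect of squeeze on the label tensor can be checked on a few made-up values. This is only a sketch: the array labels stands in for what to_numpy() returns for a single-column DataFrame.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import numpy as np
import torch

# Made-up labels shaped like a single-column DataFrame converted with to_numpy()
labels = np.array([[0.0], [1.0], [1.0]])

t = torch.from_numpy(labels).float()
squeezed = torch.squeeze(t)

print(t.shape)         # torch.Size([3, 1])
print(squeezed.shape)  # torch.Size([3])&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The 1-dimensional shape matters later because BCELoss expects predictions and targets of the same shape.&lt;/p&gt;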
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1716132638394&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;class NET(nn.Module):
    def __init__(self, n_feature):
        super(NET, self).__init__()
        self.fc1 = nn.Linear(n_feature, 5)
        self.fc2 = nn.Linear(5,3)
        self.fc3 = nn.Linear(3,1)
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return torch.sigmoid(self.fc3(x))

net = NET(X_train.shape[1])
criterion = nn.BCELoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)
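device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # fall back to the CPU when no GPU is available
X_train = X_train.to(device)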
y_train = y_train.to(device)
X_test = X_test.to(device)
y_test = y_test.to(device)
net = net.to(device)
criterion = criterion.to(device)&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;br /&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Next, define the neural network. It is a simple fully connected network with three linear layers, using ReLU after the first two layers and a sigmoid on the output so that the result can be read as a probability. The loss function is binary cross-entropy (BCELoss), and the optimization algorithm is Adam. Then, move the data, the model, and the loss function to the selected device (the GPU when one is available) for computation. Continued in the next post.&lt;/span&gt;&lt;/p&gt;
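&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;To make the loss concrete: for a predicted probability p and a label y, binary cross-entropy is -[y log(p) + (1 - y) log(1 - p)], and with the default mean reduction it is averaged over the batch. The sketch below evaluates this formula in plain Python on made-up numbers. The helper binary_cross_entropy is a hypothetical name; it illustrates the formula only and is not PyTorch's implementation.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import math

def binary_cross_entropy(probs, targets):
    # Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples
    losses = [
        -(y * math.log(p) + (1 - y) * math.log(1 - p))
        for p, y in zip(probs, targets)
    ]
    return sum(losses) / len(losses)

# Made-up sigmoid outputs and true labels
probs = [0.9, 0.2, 0.6]
targets = [1, 0, 1]

loss = binary_cross_entropy(probs, targets)
print(round(loss, 4))  # 0.2798&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Confident correct predictions (0.9 for a positive, 0.2 for a negative) contribute little, while the hesitant 0.6 contributes the most to the average.&lt;/p&gt;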
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/168</guid>
      <comments>https://finite-elem.tistory.com/168#entry168comment</comments>
      <pubDate>Wed, 15 May 2024 19:47:52 +0900</pubDate>
    </item>
    <item>
      <title>165_Debugging with Learning and Validation Curves</title>
      <link>https://finite-elem.tistory.com/167</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;If a model is too complex relative to the given training dataset, it tends to overfit the training data and may not generalize well to unseen data. To reduce the degree of overfitting, collecting more training examples can be beneficial. However, in many real-world scenarios, collecting more data is often very challenging. By plotting the model's training and validation accuracy as a function of the training dataset size, it becomes easier to detect whether the model is experiencing high variance or high bias issues and to determine if collecting more data could help resolve these problems.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;First, here is how to use scikit-learn's '&lt;b&gt;learning_curve&lt;/b&gt;' function to evaluate the model.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
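&lt;pre id=&quot;code_1715777061117&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X_train and y_train are the training split prepared beforehand (not shown in this post)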
pipe_lr = make_pipeline(StandardScaler(),
                        LogisticRegression(penalty='l2',
                                           max_iter = 10000))
train_sizes, train_scores, test_scores = \
    learning_curve(estimator=pipe_lr,
                   X=X_train,
                   y=y_train,
                   train_sizes=np.linspace(
                       0.1, 1.0, 10),
                   cv = 10,
                   n_jobs=1)

train_mean = np.mean(train_scores, axis = 1)
train_std = np.std(train_scores, axis =1)
test_mean = np.mean(test_scores, axis = 1)
test_std = np.std(test_scores, axis=1)
plt.plot(train_sizes, train_mean, color = 'blue', marker = 'o', markersize = 5, label = 'Training accuracy')
plt.fill_between(train_sizes, train_mean+train_std, train_mean-train_std, alpha = 0.15, color = 'blue')
plt.plot(train_sizes, test_mean, color = 'green', linestyle='--', marker='s', markersize = 5, label = 'Validation accuracy')
plt.fill_between(train_sizes, test_mean+test_std, test_mean-test_std, alpha = 0.15, color = 'green')
plt.grid()
plt.xlabel('Number of training examples')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.ylim([0.8,1.03])
plt.show()&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;The '&lt;/span&gt;&lt;b&gt;train_sizes&lt;/b&gt;'&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; parameter of the '&lt;/span&gt;&lt;b&gt;learning_curve&lt;/b&gt;'&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function controls the absolute or relative number of training examples used to generate the learning curve. In this example, np.linspace(0.1, 1.0, 10) selects 10 evenly spaced relative sizes, from 10% to 100% of the training data available in each cross-validation split. By default, the '&lt;/span&gt;&lt;b&gt;learning_curve&lt;/b&gt;'&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function uses stratified $k$-fold cross-validation for classifiers to calculate the cross-validation accuracy, and we set the &lt;/span&gt;cv&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; parameter to 10 to perform 10-fold stratified cross-validation.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Next, we average the cross-validated training and test scores for each training dataset size and plot these mean accuracies. We also use the '&lt;/span&gt;&lt;b&gt;fill_between&lt;/b&gt;'&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; function to shade a band of one standard deviation above and below each mean, which indicates the variability of the estimate across folds.&lt;/span&gt;&lt;/p&gt;
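&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The shapes returned by learning_curve explain why the averages are taken over axis 1. The sketch below runs the function on synthetic data generated only for this illustration (200 examples, 5 folds, 5 training sizes); it is not the dataset used for the plot.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic two-class data, made up for illustration: 200 examples, 3 features
rng = np.random.default_rng(0)
y = np.tile([0, 1], 100)                    # alternating labels, 100 per class
X = rng.normal(size=(200, 3)) + y[:, None]  # shift class 1 so the classes differ

train_sizes, train_scores, test_scores = learning_curve(
    estimator=LogisticRegression(),
    X=X,
    y=y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
)

print(train_sizes)         # absolute number of training examples at each point
print(train_scores.shape)  # (5, 5): one row per training size, one column per fold

# Averaging over axis=1 collapses the folds, leaving one value per training size
train_mean = np.mean(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)
print(train_mean.shape)    # (5,)&lt;/code&gt;&lt;/pre&gt;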
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;576&quot; data-origin-height=&quot;432&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cu9xvX/btsHqMFt9IN/gshKUbFaTd0pQahKzlpGIK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cu9xvX/btsHqMFt9IN/gshKUbFaTd0pQahKzlpGIK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cu9xvX/btsHqMFt9IN/gshKUbFaTd0pQahKzlpGIK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fcu9xvX%2FbtsHqMFt9IN%2FgshKUbFaTd0pQahKzlpGIK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;471&quot; height=&quot;353&quot; data-origin-width=&quot;576&quot; data-origin-height=&quot;432&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;With more than about 250 training examples, the model performs quite well on both the training and validation datasets. With fewer than 250 examples, the training accuracy is higher and the gap between training and validation accuracy widens, which indicates a growing degree of overfitting.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Validation curves are useful tools for improving model performance by addressing issues such as overfitting or underfitting. Validation curves are related to learning curves, but instead of plotting training and testing accuracy as a function of sample size, they plot accuracy as a function of changing model parameter values.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1715778100093&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;from sklearn.model_selection import validation_curve

param_range = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
train_scores, test_scores = validation_curve(
    estimator = pipe_lr,
    X = X_train,
    y = y_train,
    param_name = 'logisticregression__C',
    param_range = param_range,
    cv = 10
)

train_mean = np.mean(train_scores, axis = 1)
train_std = np.std(train_scores, axis = 1)
test_mean = np.mean(test_scores, axis = 1)
test_std = np.std(test_scores, axis = 1)
plt.plot(param_range, train_mean,
         color = 'blue', marker = 'o',
         markersize = 5, label = 'Training accuracy')
plt.fill_between(param_range, train_mean + train_std,
                 train_mean - train_std, alpha = 0.15,
                 color = 'blue')
plt.plot(param_range, test_mean,
         color = 'green', linestyle='--',
         marker = 's', markersize = 5,
         label = 'Validation accuracy')
plt.fill_between(param_range,
                 test_mean+test_std,
                 test_mean-test_std,
                 alpha = 0.15, color = 'green')
plt.grid()
plt.xscale('log')
plt.legend(loc = 'lower right')
plt.xlabel('Parameter C')
plt.ylabel('Accuracy')
plt.ylim([0.8,1.0])
plt.show()&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;584&quot; data-origin-height=&quot;442&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/5XvBZ/btsHqqXcqA5/8xmJWjXDaT3dtJlumlzzc1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/5XvBZ/btsHqqXcqA5/8xmJWjXDaT3dtJlumlzzc1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/5XvBZ/btsHqqXcqA5/8xmJWjXDaT3dtJlumlzzc1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F5XvBZ%2FbtsHqqXcqA5%2F8xmJWjXDaT3dtJlumlzzc1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;439&quot; height=&quot;332&quot; data-origin-width=&quot;584&quot; data-origin-height=&quot;442&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;Similar to the learning_curve function, the '&lt;b&gt;validation_curve&lt;/b&gt;' function uses stratified $k$-fold cross-validation by default to estimate the classifier's performance. Inside the function call, we specify the parameter to evaluate. In this case it is C, the inverse regularization parameter of the logistic regression classifier, written as 'logisticregression__C' to reach the '&lt;b&gt;LogisticRegression&lt;/b&gt;' object inside the pipeline; the values to try are given through the '&lt;b&gt;param_range&lt;/b&gt;' parameter. As in the previous code, we plot the mean training and cross-validation accuracy along with the respective standard deviations.&lt;/p&gt;
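&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The double-underscore name can be checked directly on a pipeline. In the minimal sketch below, the variable pipe is a stand-in built only for this illustration; it shows how make_pipeline names its steps and how a step parameter is addressed.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

pipe = make_pipeline(StandardScaler(), LogisticRegression())

# make_pipeline names each step after its lowercased class name
step_names = list(pipe.named_steps)
print(step_names)  # ['standardscaler', 'logisticregression']

# 'stepname__parameter' addresses a parameter of that step
print('logisticregression__C' in pipe.get_params())  # True

pipe.set_params(logisticregression__C=0.01)
print(pipe.named_steps['logisticregression'].C)      # 0.01&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;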
&lt;p style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot; data-ke-size=&quot;size16&quot;&gt;The differences in accuracy across parameter values are subtle, but we can observe that when the regularization strength is increased (small values of C), the model slightly underfits the data. Conversely, for large values of C the regularization strength is lower, and the model tends to slightly overfit.&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/167</guid>
      <comments>https://finite-elem.tistory.com/167#entry167comment</comments>
      <pubDate>Tue, 14 May 2024 16:29:01 +0900</pubDate>
    </item>
    <item>
      <title>164_K-fold Cross Validation</title>
      <link>https://finite-elem.tistory.com/166</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;
&lt;script&gt; MathJax = { tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]} }; &lt;/script&gt;
&lt;script src=&quot;https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js&quot;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Common cross-validation techniques include hold-out cross-validation and k-fold cross-validation. These techniques help to reliably estimate the generalization performance of a model, that is, how well the model performs on unseen data.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;A classical and popular approach to estimating the generalization performance of machine learning models is the hold-out method. The initial dataset is divided into a training dataset and a test dataset; the training dataset is used to train the model, and the test dataset is used to estimate the generalization performance. In typical machine learning work, however, various parameter settings are adjusted and compared to improve prediction performance on unseen data. This process is called model selection, which means selecting the optimal hyperparameter values for a given problem. If the same test dataset is reused over and over during model selection, it effectively becomes part of the training data, and the model is more likely to overfit.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Therefore, the data is divided into three parts: training, validation, and test datasets. The training dataset is used to fit various models, and the performance on the validation dataset is used for model selection. The advantage of having a test dataset that the model has not seen before during training and model selection is that it provides a less biased estimate of the model's generalization ability on new data. Once the hyperparameter tuning is complete, the generalization performance of the model is estimated using the test dataset.&lt;/span&gt;&lt;/p&gt;
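&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;One way to obtain the three parts is to call scikit-learn's train_test_split twice. The sketch below uses made-up data, and the 60/20/20 proportions are an assumption chosen for illustration rather than a recommendation from the book.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 100 examples, 2 features, alternating binary labels
X = np.arange(200).reshape(100, 2)
y = np.tile([0, 1], 50)

# Step 1: set aside 20% of the data as the final test set
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1)

# Step 2: split the remaining 80 examples into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=1)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Models are fitted on X_train, compared on X_val, and only the finally selected model is scored once on X_test.&lt;/p&gt;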
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;A drawback of the hold-out method is that the performance estimate can be highly sensitive to how the training dataset is partitioned into training and validation subsets. Therefore, a more robust technique, $k$-fold cross-validation, is often used for performance estimation. In this method, the training data is divided into $k$ subsets, and the hold-out method is repeated $k$ times.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Specifically, in $k$-fold cross-validation, the training dataset is randomly divided into $k$ folds without replacement. Of these, $k-1$ folds are used for model training, and the remaining fold is used for performance evaluation. This procedure is repeated $k$ times, so that each fold serves as the evaluation fold once, resulting in $k$ models and $k$ performance estimates. The average performance across the independent test folds is then calculated, which gives an estimate that is less sensitive to the partitioning of the training data than the hold-out method. For this reason, $k$-fold cross-validation is commonly used for model tuning, that is, to find the hyperparameter values that yield satisfactory generalization performance.&lt;/span&gt;&lt;/p&gt;
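&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The fold rotation can be sketched in a few lines of plain Python. The helper k_fold_indices is a hypothetical name written for this illustration; it uses contiguous folds without shuffling and is not scikit-learn's implementation.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;def k_fold_indices(n_samples, k):
    # Yield (train, test) index lists for k contiguous folds, without shuffling
    base, extra = divmod(n_samples, k)
    # The first `extra` folds receive one additional example
    sizes = [base + 1] * extra + [base] * (k - extra)
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(start)) + list(range(stop, n_samples))
        yield train, test
        start = stop

splits = list(k_fold_indices(n_samples=6, k=3))
for train, test in splits:
    print('train:', train, 'test:', test)
# train: [2, 3, 4, 5] test: [0, 1]
# train: [0, 1, 4, 5] test: [2, 3]
# train: [0, 1, 2, 3] test: [4, 5]&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Every index appears in exactly one test fold, so each example is used for evaluation once and for training $k-1$ times.&lt;/p&gt;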
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;In summary, $k$-fold cross-validation makes better use of the dataset compared to the hold-out method that uses a validation set. This is because every data point is used for evaluation in $k$-fold cross-validation. Increasing the value of $k$ means more training data is used in each iteration, which reduces the bias when averaging the individual model estimates to estimate generalization performance. However, as $k$ increases, the execution time of the cross-validation algorithm also increases, and the variance of the estimates can rise due to the training folds becoming more similar to each other.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;A slightly improved approach to standard $k$-fold cross-validation is stratified $k$-fold cross-validation. Because the class proportions are preserved in each fold, this method can provide better bias and variance estimates, especially in cases of imbalanced class ratios. Using scikit-learn's '&lt;b&gt;StratifiedKFold&lt;/b&gt;' iterator, it can be implemented as follows:&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1715767813233&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import numpy as np
from sklearn.model_selection import StratifiedKFold
kfold = StratifiedKFold(n_splits=10).split(X_train,y_train)
scores = []
for k, (train,test) in enumerate(kfold):
    pipe_lr.fit(X_train[train], y_train[train])
    score = pipe_lr.score(X_train[test], y_train[test])
    scores.append(score)
    print(f'Fold: {k+1:02d}, '
          f'Class distr.: {np.bincount(y_train[train])}, '
          f'Acc.: {score:.3f}')&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;363&quot; data-origin-height=&quot;189&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/kHzNO/btsHpx3Kkzd/CwyYxKuKzbStlhIRqKegIk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/kHzNO/btsHpx3Kkzd/CwyYxKuKzbStlhIRqKegIk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/kHzNO/btsHpx3Kkzd/CwyYxKuKzbStlhIRqKegIk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FkHzNO%2FbtsHpx3Kkzd%2FCwyYxKuKzbStlhIRqKegIk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;363&quot; height=&quot;189&quot; data-origin-width=&quot;363&quot; data-origin-height=&quot;189&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Here, the '&lt;b&gt;StratifiedKFold&lt;/b&gt;' iterator from the '&lt;b&gt;sklearn.model_selection&lt;/b&gt;' module is created with the number of folds given by the '&lt;b&gt;n_splits&lt;/b&gt;' parameter, and its split method receives the training data together with the '&lt;b&gt;y_train&lt;/b&gt;' class labels so that it can stratify on them. &lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;While iterating through the $k$ folds with the kfold iterator, the logistic regression pipeline is fitted on the examples selected by the train indices. Because the scaler is part of the pipeline, it is refit on the training folds in each iteration, and the accuracy is then computed on the examples selected by the test indices. The scores are collected in a list so that the mean accuracy and the standard deviation of the estimate can be computed.&lt;/span&gt;&lt;/p&gt;
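&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;What stratification means for the folds can be seen on a tiny example. The labels below are made up for illustration (8 examples of class 0 and 4 of class 1), and the feature matrix is a placeholder because only the labels influence the split.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;import numpy as np
from sklearn.model_selection import StratifiedKFold

# Made-up imbalanced labels: 8 examples of class 0, 4 examples of class 1
y = np.array([0] * 8 + [1] * 4)
X = np.zeros((len(y), 1))  # placeholder features; the split depends only on y

skf = StratifiedKFold(n_splits=4)
test_counts = [np.bincount(y[test]).tolist() for _, test in skf.split(X, y)]
print(test_counts)  # [[2, 1], [2, 1], [2, 1], [2, 1]]&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Each test fold keeps the 2:1 class ratio of the full label vector, which is the property that plain $k$-fold splitting does not guarantee.&lt;/p&gt;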
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;scikit-learn also provides a more concise way to evaluate a model with stratified $k$-fold cross-validation: the cross_val_score function.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1715768348809&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;from sklearn.model_selection import cross_val_score
scores = cross_val_score(estimator=pipe_lr,
                         X = X_train,
                         y = y_train,
                         cv = 10,
                         n_jobs=1)
print(f'CV accuracy scores: {scores}')
print(f'CV accuracy: {np.mean(scores):.3f} '
      f'+/- {np.std(scores):.3f}')&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;671&quot; data-origin-height=&quot;62&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b3Q1TI/btsHqkikYAK/a6Ve8K1AZvXaQKCrK95kl1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b3Q1TI/btsHqkikYAK/a6Ve8K1AZvXaQKCrK95kl1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b3Q1TI/btsHqkikYAK/a6Ve8K1AZvXaQKCrK95kl1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb3Q1TI%2FbtsHqkikYAK%2Fa6Ve8K1AZvXaQKCrK95kl1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;671&quot; height=&quot;62&quot; data-origin-width=&quot;671&quot; data-origin-height=&quot;62&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt;Here, '&lt;/span&gt;&lt;b&gt;pipe_lr&lt;/b&gt;'&lt;span style=&quot;background-color: #ffffff; color: #0d0d0d; text-align: start;&quot;&gt; is defined as follows.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre id=&quot;code_1715768437167&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pipe_lr = make_pipeline(StandardScaler(),
                        PCA(n_components=2),
                        LogisticRegression())&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #9d9d9d;&quot;&gt;ref: &lt;span style=&quot;background-color: #ffffff; text-align: left;&quot;&gt;Raschka, Sebastian, et al.&amp;nbsp;&lt;/span&gt;&lt;i&gt;Machine Learning with PyTorch and Scikit-Learn: Develop machine learning and deep learning models with Python&lt;/i&gt;&lt;span style=&quot;background-color: #ffffff; text-align: left;&quot;&gt;. Packt Publishing Ltd, 2022.&lt;/span&gt; &lt;/span&gt;&lt;/p&gt;</description>
      <category>Deep Learning</category>
      <author>elif</author>
      <guid isPermaLink="true">https://finite-elem.tistory.com/166</guid>
      <comments>https://finite-elem.tistory.com/166#entry166comment</comments>
      <pubDate>Mon, 13 May 2024 23:04:51 +0900</pubDate>
    </item>
  </channel>
</rss>