Trying out the Python deep learning library transformers on Ubuntu under WSL2

2023.01.03 / 2023.01.10

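This is a memo on how to quickly set up a transformers environment on Windows. Natural language processing is getting a lot of attention thanks to ChatGPT, and the open-source transformers library is an easy way to start learning it. This post goes from setting up the environment to actually trying the library out.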

Environment setup
Trying out transformers
Closing thoughts

Environment setup

Test environment

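The environment used for this setup is as follows.
・Windows 11
・WSL2 already installed (the article on setting up WSL2 is here → Link)
・Ubuntu 22.04 on Windows (available from the Windows app store)
・Python 3.10 (the Ubuntu 22.04 default, unchanged)
・VS Code is recommended (download it from the official site and install it)
・The VS Code extension for Ubuntu is also recommended (reference → Link)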
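
As a quick sanity check of the environment listed above, a minimal sketch like the following reports the Python version and guesses whether the interpreter is running under WSL. The helper names `looks_like_wsl` and `python_version` are hypothetical, and the check rests on the assumption that the WSL kernel release string contains "microsoft", so treat it as a heuristic rather than an official test.

```python
import platform
import sys


def looks_like_wsl(release=None):
    """Heuristic: WSL kernels usually report 'microsoft' in their release string."""
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


def python_version():
    """Return the running interpreter version as 'major.minor'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


print("Python:", python_version())  # the article assumes 3.10
print("Running under WSL (heuristic):", looks_like_wsl())
```

Passing a release string explicitly, as in `looks_like_wsl("5.15.0-56-generic")`, makes it possible to try the heuristic without being on WSL.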

Setting up Ubuntu and venv

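We will build the environment by following the installation guide on the official transformers site. First, set up the basic Python tooling, install venv, and enter a virtual environment. The virtual environment is created in a directory named ".env" in the Ubuntu home directory. From then on, after every restart, run "source .env/bin/activate" in the home directory to enter the virtual environment again.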

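# Refresh the package index
$ sudo apt-get update
# Install the basic Python tooling and upgrade pip
$ sudo apt -y install python3-dev python3-pip python3-setuptools
$ pip install -U pip setuptools
$ source .profile
# Install venv, create the virtual environment in ~/.env, and activate it
$ sudo apt -y install python3.10-venv
$ python3 -m venv .env
$ source .env/bin/activate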
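
To confirm that the activation worked, one option is to ask Python itself. The sketch below relies on the fact that inside a venv `sys.prefix` points at the environment while `sys.base_prefix` still points at the system installation. The helper name `in_virtualenv` is hypothetical and used only for illustration.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base prefix)."""
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
```

If the second line prints `False`, the shell is most likely still using the system Python, and "source .env/bin/activate" needs to be run again.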

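Installing the libraries and testing transformers

The required libraries are packaged and can be installed easily with pip. Once the installation finishes, run the example code to confirm that text classification works.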

$ pip install transformers
$ pip install 'transformers[torch]'
$ pip install 'transformers[tf-cpu]'
$ pip install 'transformers[flax]'
$ python3 -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('we love you'))"
[{'label': 'POSITIVE', 'score': 0.9998704195022583}]

The command above has the text classification of the transformers library judge "we love you" to be "POSITIVE". This confirms that the library works without problems.
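
Since the pip commands above install several optional backends, it can be handy to see which ones actually ended up in the virtual environment. The following is a minimal sketch using only the standard library. The helper name `installed_packages` is hypothetical, and the import names in `PACKAGES` are my assumption about what the extras provide.

```python
from importlib.util import find_spec

# Import names to look for (assumed to match the extras installed above)
PACKAGES = ("transformers", "torch", "tensorflow", "flax")


def installed_packages(names=PACKAGES):
    """Map each import name to True if it can be found in the current environment."""
    return {name: find_spec(name) is not None for name in names}


for name, present in installed_packages().items():
    print(f"{name:<14}{'installed' if present else 'missing'}")
```

`find_spec` only locates the package without importing it, so the check stays fast even for heavy libraries.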

Trying out transformers

From here on, we write Python code that uses transformers in VS Code and run it.

$ code .

Text generation

Let's generate text with "text-generation".

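# Generate two continuations for each of two prompts

from transformers import pipeline

# With no model specified, the pipeline falls back to its default text-generation model
generator = pipeline(task="text-generation")

def text_generator(prompts):
    """Return two generated continuations for each prompt."""
    return generator(prompts, num_return_sequences=2)

prompts = [
    "The best place to visit in Japan is",
    "I came to work and turned on my computer. Then ",
]

results = text_generator(prompts)

# The result holds one list per prompt, each with one dict per generated sequence
for candidates in results:
    for candidate in candidates:
        print(candidate["generated_text"])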

$ python3 code01.py

The best place to visit in Japan is Shinjuku station, which trains in the morning, with a total run time of 15 minutes.

The train from Osaka to Tokyo connects daily via Shio's subway system and connects to various local market

The best place to visit in Japan is near the Tohoku-to-Nagoya railway station, at which one of the only two train routes connects to Hiroshima.

I came to work and turned on my computer. Then  a while later I saw  a video of a different girl dancing, singing and doing a  big  hoot  from somewhere  I don't recall. A

I came to work and turned on my computer. Then  I saw a message from the driver saying that this time for security reasons  I would need to install some security software and reboot my computer. I went to sleep when the driver said to

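We had transformers come up with two continuations each for "The best place to visit in Japan is" and "I came to work and turned on my computer. Then ". The content of both is rather dubious, but the output does read like plausible text. (With proper fine-tuning, the content would presumably become more polished.)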
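
The nested indexing in the script reflects the shape of the result: one inner list per prompt, each holding one dict per generated sequence, and each "generated_text" starts with the prompt itself, as the output above shows. The sketch below separates the continuation from the prompt. It uses hand-written placeholder data in that shape rather than real model output, and the helper name `continuations` is hypothetical.

```python
def continuations(prompts, results):
    """Pair each prompt with its generated continuations, prompt text removed."""
    paired = {}
    for prompt, candidates in zip(prompts, results):
        texts = [c["generated_text"] for c in candidates]
        paired[prompt] = [t[len(prompt):] if t.startswith(prompt) else t for t in texts]
    return paired


# Placeholder data in the same shape as the pipeline result (not real model output)
prompts = ["Prompt A", "Prompt B"]
results = [
    [{"generated_text": "Prompt A first"}, {"generated_text": "Prompt A second"}],
    [{"generated_text": "Prompt B first"}, {"generated_text": "Prompt B second"}],
]

for prompt, texts in continuations(prompts, results).items():
    print(prompt, "->", texts)
```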

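Image recognition

Next, image recognition. This time we use "vqa" (visual question answering), which answers questions about the content of an image. The image is a photo of carrots from the JA website. We show it to the model and ask "What are the vegetables?" and "How many carrots are in the picture?".

Image to recognize (https://life.ja-group.jp/food/shun/detail?id=9)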

from transformers import pipeline

image = "https://life.ja-group.jp/upload/food/vegetable/main/9_1.jpg"
questions = ["What are the vegetables?", "How many carrots are in the picture?"]

# With no model specified, the pipeline falls back to its default VQA model
vqa = pipeline(task="vqa")

for question in questions:
    preds = vqa(image=image, question=question)
    # Reshape each prediction into {answer: score}, rounding the score to 4 digits
    preds = [{pred["answer"]: round(pred["score"], 4)} for pred in preds]
    print(question)
    print(preds)

$ python3 code02.py

What are the vegetables?
[{'carrots': 0.9647}, {'carrot': 0.2542}, {'yes': 0.0135}, {'vegetables': 0.0034}, {'oranges': 0.0032}]
How many carrots are in the picture?
[{'8': 0.3243}, {'6': 0.3096}, {'7': 0.281}, {'5': 0.1736}, {'9': 0.1342}]

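The result is produced as pairs of score and answer, listed in descending order of score. "carrots" and "8" are the top-scoring answers, as hoped. Looking at the photo, the number of carrots is a little hard to tell even for a person.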
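
One thing worth noticing in the output above is that the scores do not add up to 1 (0.9647 and 0.2542 alone already exceed it), so each score seems better read as a separate confidence for that answer than as a share of a single probability distribution. The sketch below picks the top answer only when its score clears a cutoff. The helper name `best_answer` and the 0.5 threshold are assumptions chosen for illustration, and the data is copied from the run above.

```python
# Predictions for "What are the vegetables?" as printed in the run above,
# written back in the {"score", "answer"} form
preds = [
    {"score": 0.9647, "answer": "carrots"},
    {"score": 0.2542, "answer": "carrot"},
    {"score": 0.0135, "answer": "yes"},
    {"score": 0.0034, "answer": "vegetables"},
    {"score": 0.0032, "answer": "oranges"},
]


def best_answer(predictions, threshold=0.5):
    """Return the top answer, or None if even the best score is below the threshold."""
    top = max(predictions, key=lambda p: p["score"])
    return top["answer"] if top["score"] >= threshold else None


total = sum(p["score"] for p in preds)
print(best_answer(preds))
print(round(total, 4))
```

With the same cutoff, the counting question would return `None`, because its best score (0.3243 for "8") is well below 0.5. That matches the impression that the model is far less sure about the number of carrots than about what they are.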

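GPT-NeoX-Japanese, a language model specialized for Japanese

Next we try GPT-NeoX-Japanese, a language model trained on Japanese. The code is based on the example on the official page. It generates a continuation of "日本に来てぜひ立ち寄ったほうがいい観光地は、" ("The sightseeing spot you should definitely visit when you come to Japan is").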

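from transformers import GPTNeoXJapaneseForCausalLM, GPTNeoXJapaneseTokenizer

# Load the pretrained model and its tokenizer
model = GPTNeoXJapaneseForCausalLM.from_pretrained("abeja/gpt-neox-japanese-2.7b")
tokenizer = GPTNeoXJapaneseTokenizer.from_pretrained("abeja/gpt-neox-japanese-2.7b")

# Prompt: "The sightseeing spot you should definitely visit when you come to Japan is"
prompt = "日本に来てぜひ立ち寄ったほうがいい観光地は、"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Sample a continuation; max_length caps the prompt plus the generated tokens
gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=100,
)
# Decode the first generated sequence back into text
gen_text = tokenizer.batch_decode(gen_tokens, skip_special_tokens=True)[0]

print(gen_text)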

$ python3 code03.py

日本に来てぜひ立ち寄ったほうがいい観光地は、やはり東京でしょうね。

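With the default settings things do not always go well, but this time the model produced a sentence that makes sense (roughly, "The sightseeing spot you should definitely visit when you come to Japan is, after all, Tokyo"). Tuning for the intended use should improve the results further.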
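
Part of the run-to-run variation comes from `do_sample=True` and `temperature=0.9` in the script: the next token is drawn at random, and the temperature controls how strongly the draw favors the highest-scoring candidates. The toy sketch below illustrates the general idea of temperature sampling in plain Python. It is not how the library implements it, the function names are hypothetical, and the vocabulary and scores are made up.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature=1.0, rng=random):
    """Draw one token at random according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Toy vocabulary and scores (made up for illustration, not from the real model)
tokens = ["Tokyo", "Kyoto", "Osaka"]
logits = [2.0, 1.0, 0.5]

for t in (0.5, 0.9, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
print(sample_token(tokens, logits, temperature=0.9))
```

A lower temperature concentrates probability on the top candidate, which tends to give safer but more repetitive text, while a higher one spreads it out and makes odd continuations more likely.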

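Closing thoughts

Natural language processing is in the spotlight because of ChatGPT, and transformers gives you access to open-source natural language processing models. They are quite accurate even out of the box, so building on them should be more efficient than training a model from scratch.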