{"id":3586,"date":"2022-09-09T14:24:28","date_gmt":"2022-09-09T05:24:28","guid":{"rendered":"https:\/\/weseek.co.jp\/tech\/?p=3586"},"modified":"2023-04-17T10:45:32","modified_gmt":"2023-04-17T01:45:32","slug":"%e3%80%90%e9%9f%b3%e5%a3%b0%e8%aa%8d%e8%ad%98%e3%80%91python-x-cmu-sphinx-%e3%81%a7%e9%9f%b3%e5%a3%b0%e8%aa%8d%e8%ad%98%e5%85%a5%e9%96%80","status":"publish","type":"post","link":"https:\/\/weseek.co.jp\/tech\/3586\/","title":{"rendered":"[Speech Recognition] Getting Started with Speech Recognition Using Python x CMU Sphinx"},"content":{"rendered":"<p>Hello everyone! I am <a href=\"https:\/\/twitter.com\/TaichiMasuyama\">Masuyama<\/a>, a software engineer at <a href=\"https:\/\/weseek.co.jp\/ja\/\">WESEEK<\/a>.<\/p>\n<p>In this post, we will try out simple speech recognition using Python and the speech recognition library CMU Sphinx.<\/p>\n<p><!--more--><\/p>\n<h1>Table of Contents<\/h1>\n\n<h1>Source Code<\/h1>\n<p>Working code is available at <a href=\"https:\/\/github.com\/hakumizuki\/python_sphinx_sample\">https:\/\/github.com\/hakumizuki\/python_sphinx_sample<\/a>.<\/p>\n<p>The explanation is based on this repository, so if you would like to run it on your own machine, please run <code>git clone https:\/\/github.com\/hakumizuki\/python_sphinx_sample<\/code>.<\/p>\n<p>The explanation uses a devcontainer, 
but when working with external devices such as a microphone, I think a physical machine is easier to use than a container. In that case, refer to the Dockerfile (for example, the part that installs the dependency packages) and run the code directly on your OS.<\/p>\n<h1>The SpeechRecognition Library<\/h1>\n<p>To use CMU Sphinx from Python, we use the <a href=\"https:\/\/pypi.org\/project\/SpeechRecognition\/\">SpeechRecognition library<\/a>, which provides a common interface to a variety of speech recognition APIs.<\/p>\n<p>CMU Sphinx is a speech recognition tool developed as open source software. Unlike services such as the Google Cloud Speech API, it runs completely offline, so I think one of its strengths is that it does not need an internet connection. By preparing a language model for the target language, it can in principle recognize speech in any language. For details, see <a href=\"https:\/\/cmusphinx.github.io\/\">the CMU Sphinx site<\/a>.<\/p>\n<p>If you would like to try other engines, see <a href=\"https:\/\/pypi.org\/project\/SpeechRecognition\/\">the SpeechRecognition page on PyPI<\/a>.<\/p>\n<h1>Environment Setup<\/h1>\n<ol>\n<li>Install 
Docker\n<ul>\n<li><a href=\"https:\/\/www.docker.com\/\">https:\/\/www.docker.com\/<\/a><\/li>\n<\/ul>\n<\/li>\n<li>Install VSCode\n<ul>\n<li><a href=\"https:\/\/code.visualstudio.com\/\">https:\/\/code.visualstudio.com\/<\/a><\/li>\n<\/ul>\n<\/li>\n<li><code>$ git clone https:\/\/github.com\/hakumizuki\/python_sphinx_sample<\/code><\/li>\n<li><code>$ cd .\/python_sphinx_sample<\/code><\/li>\n<li><code>$ code .<\/code><\/li>\n<li>Press Ctrl+Shift+P (Command+Shift+P on macOS) to open the Command Palette, type Reopen in Container, and click the suggestion that appears<\/li>\n<\/ol>\n<p>The devcontainer now starts, and the environment setup is complete. Let us look at the program.<\/p>\n<h1>Program Walkthrough<\/h1>\n<p>The main library used is <a href=\"https:\/\/pypi.org\/project\/SpeechRecognition\/\">SpeechRecognition<\/a>, a Python library for speech recognition.<\/p>\n<p>Constants such as file paths are in <code>constants.py<\/code>, and the speech recognition program is in <code>recognize.py<\/code>.<\/p>\n<p><a href=\"https:\/\/github.com\/hakumizuki\/python_sphinx_sample\/blob\/main\/src\/constants.py\">constants.py<\/a><\/p>\n<pre><code class=\"language-python\">import os\n\nBASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))\n\nMP3_TEMP_OUT=f&#039;{BASE_DIR}\/audios\/speech.temp.mp3&#039;\nWAV_OUT=f&#039;{BASE_DIR}\/audios\/speech.wav&#039;<\/code><\/pre>\n<p><a 
href=\"https:\/\/github.com\/hakumizuki\/python_sphinx_sample\/blob\/main\/src\/recognize.py\">recognize.py<\/a><\/p>\n<pre><code class=\"language-python\">import speech_recognition as sr\nimport sys\n\nfrom constants import BASE_DIR, WAV_OUT\n\ndef main():\n    mode = None\n\n    # *1\n    # Get mode from args\n    if len(sys.argv) &gt; 2:\n        raise Exception(f&#039;Too many args. (pass &quot;mic&quot; or don\\&#039;t pass to use &quot;{WAV_OUT}&quot; file)&#039;)\n    elif len(sys.argv) == 2:\n        mode = sys.argv[1]\n\n    recognize_result = recognize(mode)\n\n    print(recognize_result)\n\ndef recognize(mode):\n    text = None\n    # *1\n    audio_src = sr.Microphone() if mode == &#039;mic&#039; else sr.AudioFile(WAV_OUT)\n\n    with audio_src as audio_file:\n        # *2\n        # Initialize a Recognizer\n        r = sr.Recognizer()\n        # Remove noise\n        # r.adjust_for_ambient_noise(audio_file, duration=10)\n        # Convert audio source to a recognizable object\n        recognizable = r.record(audio_file, duration=10)\n        text = r.recognize_sphinx(recognizable)\n\n    return text<\/code><\/pre>\n<h3>*1<\/h3>\n<p>The intended usage is <code>python recognize.py &lt;mode&gt;<\/code>. If you pass <code>mic<\/code> as mode, audio captured from the microphone is used as the audio source; if you pass nothing, the audio file at the <code>WAV_OUT<\/code> 
path defined in constants.py is used as the audio source.<\/p>\n<p>To use the microphone, choose one of the following.<\/p>\n<ul>\n<li>Run the script outside the container<\/li>\n<li>Configure the container so that it can use the microphone, referring to resources such as <a href=\"https:\/\/stackoverflow.com\/questions\/45700653\/can-my-docker-container-app-access-the-hosts-microphone-and-speaker-mac-wind\">this Stack Overflow question<\/a><\/li>\n<\/ul>\n<h3>*2<\/h3>\n<ol>\n<li>Create a <code>Recognizer<\/code> instance<\/li>\n<li>Use the <code>record<\/code> method to create an object for speech recognition from the audio source<\/li>\n<li>Pass that object to the <code>recognize_sphinx<\/code> method to run speech recognition<\/li>\n<\/ol>\n<h1>Running It<\/h1>\n<p>This time, we will create an audio file with speech.py, which is included in the repository, and have it recognized. speech.py takes an English string and converts it into an audio file.<\/p>\n<ol>\n<li><code>$ mkdir .\/audios<\/code>\n<ul>\n<li>At the same level as src<\/li>\n<\/ul>\n<\/li>\n<li><code>$ python 
speech.py &quot;apple banana&quot;<\/code>\n<ul>\n<li>You can change this to any English text you like<\/li>\n<\/ul>\n<\/li>\n<li><code>$ python recognize.py<\/code><\/li>\n<\/ol>\n<pre><code>$ python recognize.py\napple banana<\/code><\/pre>\n<p>If this output is displayed, it worked.<\/p>\n<h1>Closing<\/h1>\n<p>Thank you for reading this far.<\/p>\n<p>If you have any questions, send a DM to <a href=\"https:\/\/twitter.com\/TaichiMasuyama\">Masuyama on Twitter<\/a> and I will answer as far as I can.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hello everyone! I am Masuyama, a software engineer at WESEEK. In this post, we will try out simple speech recognition using Python and the speech recognition library 
CMU Sphinx.<\/p>\n","protected":false},"author":3,"featured_media":3588,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[118,160],"tags":[],"_links":{"self":[{"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/posts\/3586"}],"collection":[{"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/comments?post=3586"}],"version-history":[{"count":9,"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/posts\/3586\/revisions"}],"predecessor-version":[{"id":3624,"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/posts\/3586\/revisions\/3624"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/media\/3588"}],"wp:attachment":[{"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/media?parent=3586"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/categories?post=3586"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/weseek.co.jp\/tech\/wp-json\/wp\/v2\/tags?post=3586"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}