Researcher Achievements

荒井 隆行

アライ タカユキ  (Arai Takayuki)

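Basic Information

Affiliation
Professor, Department of Information and Communication Sciences, Faculty of Science and Technology, Sophia University
Degrees
Bachelor of Engineering (Sophia University)
Master of Engineering (Sophia University)
Doctor of Engineering (Sophia University)

Contact
araisophia.ac.jp
Researcher Number
80266072
J-GLOBAL ID
200901064275514612
researchmap Member ID
1000260131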

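<Japan>
April 2008  Professor, Department of Information and Communication Sciences, Faculty of Science and Technology, Sophia University (to present)
April 2006  Professor, Department of Electrical and Electronics Engineering, Faculty of Science and Technology, Sophia University
April 2000  Associate Professor, Department of Electrical and Electronics Engineering, Faculty of Science and Technology, Sophia University
April 1998  Lecturer, Department of Electrical and Electronics Engineering, Faculty of Science and Technology, Sophia University
April 1994  Research Associate, Department of Electrical and Electronics Engineering, Faculty of Science and Technology, Sophia University
March 1994  Completed the doctoral program in Electrical and Electronics Engineering, Graduate School of Science and Technology, Sophia University
March 1991  Completed the master's program in Electrical and Electronics Engineering, Graduate School of Science and Technology, Sophia University
March 1989  Graduated from the Department of Electrical and Electronics Engineering, Faculty of Science and Technology, Sophia University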

<Overseas>
October 2003 - September 2004  Visiting Researcher, Massachusetts Institute of Technology, USA
August 2000, August 2001, August 2002, and October 2003 - September 2004
       Visiting Researcher, Massachusetts Institute of Technology, USA
February 2001  Visiting Researcher, Max Planck Institute for Psycholinguistics, the Netherlands
August 2000  Visiting Researcher, Massachusetts Institute of Technology, USA
January 1997 - March 1998 / August 1998 and August 1999
       Visiting Researcher, International Computer Science Institute,
        a research institute affiliated with the University of California, Berkeley, USA
September 1992 - August 1993 and June 1995 - December 1996
        Visiting Researcher, Oregon Graduate Institute of Science and Technology, USA

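The chain of events involved in speech communication is called the "Speech Chain," a fundamental concept in speech science and hearing science. The research has focused mainly on the scientific aspects of speech communication, spanning speech and hearing science, acoustics, and acoustic phonetics, and on their applications. Its scope has also been extended to every aspect of sound. The areas covered include the following broad, interdisciplinary fields:
・Acoustics and education in acoustics (e.g., vocal-tract models)
・Linguistics centered on acoustic phonetics (phonetics and phonology) and its educational applications (applied linguistics)
・Speech science including speech production, hearing science including speech perception, and cognitive science involving sound and speech
・Speech perception and speech intelligibility in real environments; speech signal processing and speech enhancement
・Assistive technology and disability support related to speech; acoustic analysis of disordered speech; speech production and speech perception of hearing-impaired and elderly people
・Development of speech processing algorithms, including real-time signal processing; development of sound-related systems and apps
・Speaker individuality in speech
・Other research on sound in general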

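(Research Themes)
Acoustics and education in acoustics (including vocal-tract models)
Linguistics centered on acoustic phonetics (phonetics and phonology) and its educational applications (applied linguistics)
Speech science including speech production, hearing science including speech perception, and cognitive science involving sound and speech
Speech perception and speech intelligibility in real environments; speech signal processing and speech enhancement
Assistive technology and disability support related to speech; acoustic analysis of disordered speech; speech production and speech perception of hearing-impaired and elderly people
Development of speech processing algorithms, including real-time signal processing; development of sound-related systems and apps
Speaker individuality in speech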

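(Desired Themes for Joint and Commissioned Research)
Sound information processing
Spoken language information processing
Auditory information processing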


Papers

 601
  • Takayuki Arai, Yoshiaki Murakami, Nahoko Hayashi, Nao Hodoshima, Kiyohiro Kurisu
    Acoustical Science and Technology 28(6) 438-441 2007
    Researchers investigated the correlation between the intelligibility of speech in reverberation and the amount of overlap-masking (OLM) due to reverberation. A high correlation was found between the results of a perceptual experiment and the values of the proposed intelligibility measure and SOR, which is defined as the signal-to-OLM ratio. The intelligibility of speech in reverberation was inversely correlated with the amount of overlap-masking. In the steady-state suppression technique, overlap-masking is reduced by estimating and suppressing steady-state portions of speech that have high energy but are less important for speech perception, such as the nuclei of syllables. The advantages of the proposed measures are that they reflect the reverberation characteristics of a room, as contained in the impulse response of the room, and that they also reflect the characteristics of the speech signal itself and the effect of any pre-processing.
  • Takayuki Arai, Yuki Nakata, Nao Hodoshima, Kiyohiro Kurisu
    Acoustical Science and Technology 28(4) 282-285 2007
    The effects of steady-state suppression after slowing the speaking rate of a speech signal were investigated. It was observed that slowing the speech rate improves speech intelligibility in a reverberant environment. It was also observed that speaking slowly helps to increase speech intelligibility, particularly in a large hall with a long reverberation time. Slowing speech by isolating each syllable would be more effective for improving speech intelligibility. An increase in overlap masking makes speech difficult to understand clearly. Steady-state suppression is proposed to reduce overlap masking as a preprocess for speech signals in reverberant environments. Artificial reverberant environments were created by convolving speech samples with impulse responses. Pairwise comparison also showed significant improvements from steady-state suppression.
  • Kanae Amino, Takayuki Arai
    Acoustical Science and Technology 28(2) 128-130 2007
    In this study, we conducted a perceptual speaker identification experiment in order to examine the effects of speaker-listener familiarity and of the stimulus content. We used the same materials as those used in our previous study [6], where familiar listeners identified the speakers. The results showed that familiar listeners performed significantly better than naive listeners; however, the overall effects of the stimulus content were similar between familiar and naive listeners. The nasals /na/ and /nja/ were particularly effective for speaker identification, and the identification score differences among the coronal nasals and the labial nasal were again observed in this study. © 2007 The Acoustical Society of Japan.
  • Nao Hodoshima, Yusuke Miyauchi, Keiichi Yasu, Takayuki Arai
    Acoustical Science and Technology 28(1) 53-55 2007
    The effect of steady-state suppression on speech intelligibility for an elderly person under various reverberation conditions was studied. Processed and unprocessed speech materials were reproduced under three reverberant conditions, with reverberation times (RTs) of 0.7, 1.0 and 1.2 s, represented by an impulse response measured in Hamming Hall in Tokyo. The computer-controlled listening test was conducted in a sound-treated room, and the sound level was adjusted to a comfortable level for the participant before the beginning of trials. The degree of improvement in perception produced by steady-state suppression differed between the elderly participant and young normal-hearing participants in each reverberant condition. With RTs of 0.7 and 1.0 s, the participant achieved higher scores for steady-state suppressed signals than for unprocessed signals.
  • Takayuki Arai, Natasha Warner, Steven Greenberg
    Acoustical Science and Technology 28(1) 46-48 2007
    An analysis of pronunciation variations in the Japanese component of the Oregon Graduate Institute Multi-Language Telephone Speech (OGI-TS) Corpus is presented. These variations include reduction or deletion, as well as the frequencies of occurrence and durations of both vowels and consonants in the corpus. This corpus contains 90 calls, each uttered by a unique adult speaker. Filled pauses, hesitations and other instances of interruption in the speech stream were also transcribed. Non-high vowel devoicing is more common in this corpus than would be anticipated on the basis of the published literature. In Japanese, the main difference between careful and spontaneous speech is in the proportion of vowel devoicing and deletion. The variations in pronunciation of consonants in Japanese include glottal fricatives, nasalization of vowels before nasals, and other forms of consonant reduction.
  • Takayuki Arai
    Acoustical Science and Technology 28(3) 190-201 2007
    In this paper, we present and discuss an educational system in the fields of acoustics and speech science using a series of physical models of the human vocal tract. Because education in acoustics is relevant for several fields related to speech communication, it hosts students from a variety of educational backgrounds. Moreover, we believe that an education in acoustics is important for students of different ages: college, high school, middle school, and even elementary school students. Because of the varied student populations, we develop an educational system that instructs students intuitively and effectively and consists of the following models: lung models, an artificial larynx, Arai's models (cylinder and plate type models), Umeda and Teranishi's model (a variable-shape model), and head-shaped models. These models effectively demonstrate several principal aspects of speech production, such as phonation, source-filter theory, the relationship between vocal-tract shape/tongue movement and vowel quality, and nasalization of vowels. We have confirmed that combining the models in an effective way produces complete education in the acoustics of speech production. The examinations and questionnaire surveys conducted before and after using our proposed system revealed that the learners' understanding improves with the use of the system. The system is also effective for voice and articulatory training in speech pathology and language learning. © 2007 The Acoustical Society of Japan.
  • 岩崎純二, 片岡竜太, 山下洋介, 春日梨恵, 安啓一, 荒井隆行, 新谷悟
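    IEICE Technical Report. SP, Speech 106(443) 49-54 December 2006
    Four-dimensional MRI was performed on four healthy subjects (two men and two women) during production of /impee/, and three velopharyngeal closure patterns were identified. The position and shape of the soft palate, the velopharyngeal opening, and the levator veli palatini (LVP) muscle, which is involved in elevating the soft palate, were observed in midsagittal and horizontal sections at rest and during production of the vowel /i/ and the consonant /p/. At rest, the LVP muscle was elliptical with its long axis oriented anteroposteriorly; when the soft palate was maximally elevated, the long axis rotated toward the velopharyngeal opening (Type A), whereas with moderate elevation of the soft palate, the long axis of the LVP muscle shifted mesially in parallel (Type B). (1) In the coronal pattern, Type A movement was seen during both /i/ and /p/; (2) in the circular pattern, Type A was seen during /p/ and Type B during /i/; and (3) in the circular with Passavant's ridge pattern, Type B was seen during both /i/ and /p/.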
  • Kanako Ueno, Takayuki Arai, Fumiaki Satoh, Akira Nishimura, Koichi Yoshihisa
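    J. Acoust. Soc. Am 120(5, Pt.2) 3116 November 2006
  • 小松雅彦, 荒井隆行
    Proceedings of the National Convention of the Phonetic Society of Japan 195-200 September 2006
  • 網野加苗, 荒井隆行
    Proceedings of the Meeting of the Acoustical Society of Japan 273-274 September 2006
  • 向奈津美, 北口直, 金寺登, 荒井隆行, 藤樫佑樹, 古賀綾子, 吉井順子, 船田哲男
    Proceedings of the Meeting of the Acoustical Society of Japan 263-264 September 2006
  • 古賀綾子, 藤樫佑樹, 荒井隆行, 金寺登, 吉井順子
    Proceedings of the Meeting of the Acoustical Society of Japan 261-262 September 2006
  • 加島慎平, 飯田朱美, 安啓一, 荒井隆行, 菅原勉
    Proceedings of the Meeting of the Acoustical Society of Japan 251-252 September 2006
  • 荒井 隆行, K. Ohta, K. Yasu
    Proceedings of the DSPS Educators Conference, 2006 55-58 September 2006
  • Hodoshima N, Arai T, Kusumoto A, Kinoshita K
    The Journal of the Acoustical Society of America 119(6) 4055-4064 June 2006  Peer-reviewed
  • Nao Hodoshima, Takayuki Arai, Akiko Kusumoto, Keisuke Kinoshita
    J. Acoust. Soc. Am 119(6) 4055-4064 June 2006
  • 村上善昭, 程島奈緒, 中田有貴, 林奈帆子, 宮内裕介, 荒井隆行, 栗栖清浩
    Proceedings of the Meeting of the Acoustical Society of Japan 649-650 March 2006
  • 安啓一, 小林敬, 荒井隆行, 八田ゆかり, 南畑伸至, 進藤美津子
    Proceedings of the Meeting of the Acoustical Society of Japan 487-488 March 2006
  • 網野加苗, 菅原勉, 荒井隆行
    Proceedings of the Meeting of the Acoustical Society of Japan 363-364 March 2006
  • 平井沢子, 安啓一, 荒井隆行, 飯高京子
    IEICE Technical Report. SP, Speech 105(686) 17-22 March 2006
  • 安啓一, 荒井隆行, 進藤美津子
    IEICE Technical Report. SP, Speech 105(686) 1-4 March 2006
  • 平井沢子, 安啓一, 荒井隆行, 飯高京子
    The Japan Journal of Logopedics and Phoniatrics 47(1) 75-75 2006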
  • Nao Hodoshima, Dawn Behne, Takayuki Arai
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 873-+ 2006
    This study investigated whether the steady-state suppression method proposed by Arai et al. (2001, 2002) improved consonant identification for nonnative listeners in reverberation. It also compared the effect of steady-state suppression on consonant identification by native and nonnative listeners in reverberation. We used steady-state suppression as a preprocessing technique which processes speech signals before they are radiated from loudspeakers in order to reduce the amount of overlap-masking. Participants were 24 native English (native listeners) and 24 Japanese speakers (nonnative listeners), both with normal hearing. A diotic Modified Rhyme Test was conducted with and without steady-state suppression for reverberation times of 0.4, 0.7 and 1.1 s and a non-reverberant condition. The results showed that native listeners performed better than nonnative listeners, and that the mean percentage of correct answers in initial consonants was higher than in final consonants. The results also showed that processed and unprocessed speech was comparable for word initial and final consonants. These findings indicate that parameters of steady-state suppression would need adjustment to accommodate speech materials and reverberant conditions. They also suggest that the difficulties that nonnative listeners have might not be due to the actual acoustic-phonetic information from the signal.
  • ARAI TAKAYUKI, K. Amino, T. Sugawara
    Proc. of the Western Pacific Acoustics Conference (WESPAC) 2006
  • ARAI TAKAYUKI, N. Hodoshima
    Proc. of the Western Pacific Acoustics Conference (WESPAC) 2006
  • Takayuki Arai, Fumiaki Satoh, Akira Nishimura, Kanako Ueno, Koichi Yoshihisa
    Acoustical Science and Technology 27(6) 344-348 2006
    Many demonstrations for education in acoustics have been developed in Japan as well as outside the country. Since 1997, the Technical Committee on Education in Acoustics of the Acoustical Society of Japan has been investigating and discussing education in acoustics in Japan. In this review, some of the educational tools and demonstrations in acoustics are introduced. They are all designed to help us visualize and hear different phenomena and to understand abstract theories in a more intuitive way. The work that has been carried out includes some exciting demonstrations in acoustics by the high-school physics teachers' "Stray Cats Group," some visual and aural demonstrations for architectural acoustics, a technical course called "Technical Listening Training," a WWW-based training system, and physical models of the human vocal tract.
  • Kaoru Ashihara, Akira Nishimura, Takayuki Arai
    Acoustical Science and Technology 27(6) 317-317 2006
  • Takayuki Arai, Keiichi Yasu, Takahito Goto
    Acoustical Science and Technology 27(6) 393-395 2006
    A modern pattern playback, which can be very useful in pedagogical applications, was implemented. Two simple algorithms were proposed for digital pattern playback: the AM method, based on the concept of amplitude modulation (AM), and the FFT method, based on the concept of fast Fourier transform (FFT). A simple system with the FFT method was implemented for real-time processing by capturing a time slice of the input spectrogram in each frame, computing the waveform of the corresponding glottal cycle at that time frame by the inverse FFT, and producing an acoustic signal based on the glottal waveform. This real-time aspect is very important in a pedagogical situation, as the combination of simultaneous tactile, somatosensory, and auditory perception helps learners to understand the phenomenon more naturally, easily and intuitively.
  • Takayuki Arai
    Acoustical Science and Technology 27(6) 384-388 2006
    A sliding three-tube (S3T) model has been proposed as an implementation of a physical model that varies the constriction position in a three-tube resonator. This three-tube model uses a simple mechanism to produce several different vowels. This model is an idealized system of coupled resonators and can be viewed as a tube having a uniform area function with a single narrow constriction. The S3T model consists of two parts, the outer and inner cylinders. The outer cylinder is a uniform tube with a constant diameter, while the inner cylinder has a much shorter length and a smaller diameter. The S3T model is highly suitable for hands-on activities in acoustics education. Various sound sources can be used with this model, such as an electrolarynx, another type of artificial larynx such as a whistle type, or the driver unit of a horn speaker. This model can be used for many activities, from science workshops for children to demonstrations of quantal theory for graduate students.
  • Takayuki Arai
    Acoustical Science and Technology 27(5) 298-301 2006
    Acoustic cue parsing between nasality and breathiness in speech perception by the listener was investigated. The main effect of nasalization is the perturbation of the low-frequency spectrum. Nasality and breathiness have several acoustic cues that are strongly correlated with each other. Acoustic parameters correlated with subjective judgments of breathiness include the degree of aspiration noise intruding in frequencies above 1.5 kHz in vowels, and the relative strength of the fundamental component. The resulting acoustic signal might contain these cues when a vowel sound with a lowered velum or a wide-spread glottis is produced. The perceptual experiments showed that a listener parses these cues for nasality and breathiness. Some of the tendencies are that perceived nasality increases as the spacing of the nasal pole and zero becomes wider, that perceived breathiness increases as aspiration increases, and that, with strong aspiration, breathiness is higher as the open quotient increases.
  • Kanae Amino, Tsutomu Sugawara, Takayuki Arai
    Acoustical Science and Technology 27(4) 233-235 2006
    A human speaker identification test was conducted to find out the differences in the effectiveness of using various Japanese sounds in identifying the speakers. The stimuli used in the experiment were also analyzed in order to explain these differences in terms of acoustical distances. Sounds were evaluated in terms of the spectral distances in order to explain acoustically the differences among the stimuli in the perception test. It was reported that the recognition rate was improved by considering the individualities of the speakers in the oro-nasal coupling and by using the weighted linear scale spectral properties. The results suggest that the individualities of the speakers were reflected more in the spectra of the nasal sounds than in those of the oral sounds, and that the listeners perceived these individualities when they identified the speakers.
  • Takayuki Arai
    Acoustical Science and Technology 27(2) 111-113 2006
    The lung model and head-shaped models with a visible vocal tract are described as effective educational tools in acoustics. The lung model, with a whistle-type artificial larynx and bellows, represents the human respiratory system. The head-shaped model visualizes the position of the vocal tract in the head and demonstrates modal phonation. The content is helpful for understanding speech production, musicology, speech pathology, and language learning. The descriptions of Umeda and Teranishi's model, Arai's cylinder-type and plate-type models, the fixed model, the manipulable tongue model, and the head-shaped model with nasal cavities support experiments that give a clear view of speech production, the speech mechanism, and voice modulation.
  • 大木衣里子, 原恵子, 飯高京子, 進藤美津子, 荒井隆行
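    Japanese Journal of Communication Disorders 22(3) 204 December 2005
  • 荒井 隆行, 後藤崇公, 安啓一
    Proceedings of the 7th DSPS Educators Conference 91-94 September 2005
  • 荒井 隆行, 竹内京子
    Proceedings of the National Convention of the Phonetic Society of Japan 179-184 September 2005
  • 荒井 隆行
    Proceedings of the National Convention of the Phonetic Society of Japan 3 September 2005
  • 中田有貴, 村上善昭, 林奈帆子, 宮内裕介, 程島奈緒, 荒井隆行, 栗栖清浩
    Proceedings of the Meeting of the Acoustical Society of Japan 693-694 September 2005
  • 程島奈緒, 荒井隆行
    Proceedings of the Meeting of the Acoustical Society of Japan 607-608 September 2005
  • 安啓一, 荒井隆行, 小林敬, 進藤美津子
    Proceedings of the Meeting of the Acoustical Society of Japan 517-518 September 2005
  • 網野加苗, 菅原勉, 荒井隆行
    Proceedings of the Meeting of the Acoustical Society of Japan 431-432 September 2005
  • 荒井隆行, 安啓一, 後藤崇公
    Proceedings of the Meeting of the Acoustical Society of Japan 429-430 September 2005
  • 藤樫佑樹, 古賀綾子, 荒井隆行, 金寺登, 吉井順子
    Proceedings of the Meeting of the Acoustical Society of Japan 33-34 September 2005
  • 荒井隆行
    Interspeech Lisboa 2005 : 9th European Conference on Speech Communication and Technology, September 4-8, 2005. 2769-2772 September 2005
  • 荒井隆行
    Interspeech Lisboa 2005 : 9th European Conference on Speech Communication and Technology, September 4-8, 2005. 2025-2028 September 2005
  • 荒井隆行
    Interspeech Lisboa 2005 : 9th European Conference on Speech Communication and Technology, September 4-8, 2005. 1741-1744 September 2005
  • 荒井隆行
    Interspeech Lisboa 2005 : 9th European Conference on Speech Communication and Technology, September 4-8, 2005. 1033-1036 September 2005
  • Stevens Kenneth N, 荒井 隆行
    The Journal of the Acoustical Society of Japan 61(9) 524-531 September 2005
  • 荒井隆行
    Proceedings of the Auditory Research Meeting, the Acoustical Society of Japan 35(4) 237-242 May 2005
  • 林奈帆子, 程島奈緒, 井上豪, 後藤崇公, 田所史礼, 宮内裕介, 荒井隆行, 栗栖清浩
    Proceedings of the Meeting of the Acoustical Society of Japan 2005(1) 537-538 March 2005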

MISC

 80

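Lectures and Oral Presentations

 185

Works

 11

Research Projects (Joint Research, Competitive Funding, etc.)

 36

Academic Contribution Activities

 1

Social Contribution Activities

 1

Other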

 55
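  • April 2006 - June 2008
    In a course on giving presentations in English, students' presentations are videotaped so that they can see their own presentations objectively; they later watch the recordings and evaluate themselves. Students are also asked to give a second presentation on the same content, to encourage them to make an effort to improve.
  • 2003 - June 2008
    Serving as a member of a committee on education in acoustics, including organizing education sessions (for example, the education session at the joint meeting of the Acoustical Societies of America and Japan held in December 2006).
  • 2003 - June 2008
    Serving as a member of a committee on education in acoustics, including organizing education sessions (for example, the education session at the International Congress on Acoustics held in April 2004). Since 2005 in particular, has served as committee chair and has been actively engaged in its work (for example, holding a science class at the National Museum in October 2006).
  • April 2002 - June 2008
    Since joining the university, a report titled "Progress Report" on the laboratory's education and research activities has been compiled and published. This has also helped raise the awareness of the laboratory's students and has proven effective.
  • April 2002 - June 2008
    Based on the view that it is important to be accustomed to English on a regular basis, the laboratory's regular meetings are periodically held in English. In addition, since the 2006 academic year, the progress reports given at each research-group meeting have also been required to be in English.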