Curriculum Vitae

Arai Takayuki

  (荒井 隆行)

Profile Information

Affiliation
Professor, Faculty of Science and Technology, Department of Information and Communication Sciences, Sophia University
Degree
Bachelor of Engineering (Sophia University)
Master of Engineering (Sophia University)
Doctor of Engineering (Sophia University)

Contact information
arai@sophia.ac.jp
Researcher number
80266072
J-GLOBAL ID
200901064275514612
researchmap Member ID
1000260131

Research and professional experience:

2008-present Professor at the Department of Information and Communication Sciences, Sophia University
2006-2008 Professor at the Department of Electrical and Electronics Engineering, Sophia University
2003-2004 Visiting Scientist at the Research Lab. of Electronics, Massachusetts Institute of Technology (Cambridge, MA, USA)
2000-2006 Associate Professor at the Department of Electrical and Electronics Engineering, Sophia University
1998-2000 Assistant Professor at the Department of Electrical and Electronics Engineering, Sophia University
1997-1998 Research Fellow at the International Computer Science Institute / University of California at Berkeley (Berkeley, California, USA)
1995-1996 Visiting Scientist at the Department of Electrical Engineering, Oregon Graduate Institute of Science and Technology (Portland, Oregon, USA)
1994-1995 Research Associate at the Department of Electrical and Electronics Engineering, Sophia University, working with Professor Yoshida
1992-1993 Visiting Scientist at the Department of Computer Science and Engineering, Oregon Graduate Institute of Science and Technology (Portland, Oregon, USA)

Short-term Visiting Scientist:

2000, August / 2001, August / 2002, August
Massachusetts Institute of Technology (Cambridge, Massachusetts, USA)
2001, March
Max Planck Institute for Psycholinguistics (Nijmegen, the Netherlands)

The series of events involved in speech communication is called “Speech Chain,” and it is a basic concept in the speech and hearing sciences. We focus on research related to speech communication. The fields of this research are wide-ranging, and our interests include the following interdisciplinary areas:
- education in acoustics (e.g., physical models of the human vocal tract),
- acoustic phonetics,
- speech and hearing sciences,
- speech production,
- speech analysis and speech synthesis,
- speech signal processing (e.g., speech enhancement),
- speech / language recognition and spoken language processing,
- speech perception and psychoacoustics,
- acoustics for speech disorders,
- speech processing for the hearing impaired,
- speaker characteristics in speech, and
- real-time signal processing using DSP processors.

(Subject of research)
General Acoustics and Education in Acoustics (including vocal-tract models)
Acoustic Phonetics, Applied Linguistics
Speech Science (including speech production), Hearing Science (including speech perception), Cognitive Science
Speech Intelligibility, Speech Processing, Speech Enhancement
Assistive Technology related to Acoustics, Speech and Acoustics for Everybody
Speech Processing, Applications related to Acoustics
Speaker Characteristics of Speech

(Proposed theme of joint or funded research)
acoustic signal processing
speech signal processing
auditory signal processing


Research History

 2

Papers

 601

Misc.

 80
  • KAJIMA Shimpei, IIDA Akemi, YASU Keiichi, ARAI Takayuki, SUGAWARA Tsutomu
    Proceedings of the Acoustical Society of Japan Meeting (CD-ROM), 2006 1-Q-7, Sep 6, 2006  
  • KOMATSU Masahiko, ARAI Takayuki
    IEICE technical report, 105(685) 121-126, Mar 27, 2006  
    Rhythm can be viewed in two different ways. The acoustic approach views rhythm as an alternating pattern of high and low intensity, in which a high-intensity portion is regarded as a syllable. The phonemic approach attributes rhythm to the phonemic complexity of syllable structure and calculates rhythm from the durations of consonant and vowel intervals. This paper investigates how well the acoustic approach fits the phonemic approach. It tests two algorithms adopted in our previous studies, which estimate syllable centers from intensity contours based on the calculation of RMS and correlation with a cosine curve. The results are evaluated against the criteria of the phonemic approach. We conclude that the algorithms are valid, and hence that syllable shapes can be captured from the intensity contour. (A code sketch of this estimation follows this list.)
  • AMINO Kanae, SUGAWARA Tsutomu, ARAI Takayuki
    IEICE technical report, 105(685) 109-114, Mar 27, 2006  
    In speaker identification by listening, identification rates vary depending on the speech content presented to the subjects. Nasals are reported to be more effective than oral sounds for identifying speakers. The present study investigates the effectiveness of nasal sounds in terms of syllable structure. The results showed that coda nasals are highly effective, though onset consonants are also important. As for place of articulation, alveolar consonants in onset position were more effective than bilabials, and nasals were better than their oral counterparts.
  • AMINO Kanae, SUGAWARA Tsutomu, ARAI Takayuki
    Proceedings of the Auditory Research Meeting, 36(1) 109-114, Mar 27, 2006  
  • YASU Keiichi, ARAI Takayuki, SHINDO Mitsuko
    Proceedings of the Auditory Research Meeting, 36(1) 127-130, Mar 27, 2006  
  • HIRAI Sawako, YASU Keiichi, ARAI Takayuki
    Proceedings of the Auditory Research Meeting, 36(1) 143-148, Mar 27, 2006  
  • KAJIMA Shimpei, TAKESHITA Ori, YASU Keiichi, ARAI Takayuki, IIDA Akemi
    IEICE technical report. Speech, 105(685) 49-53, Mar 20, 2006  
    This study aimed to reduce ventilator noise in the speech signals of patients who use a ventilator. Two approaches, spectral subtraction and adaptive filtering, were examined. In the first experiment, we employed a new technique based on spectral subtraction for ventilator noise reduction and evaluated the result using mean opinion scores. The results showed that the new technique was superior to the common spectral subtraction approach in reducing ventilator noise. In the second experiment, an adaptive filter was used in a simulated environment to reduce ventilator noise and thr... (A spectral subtraction sketch follows this list.)
  • KOBAYASHI Kei, HATTA Yukari, YASU Keiichi, MINAMIHATA Shinji, HODOSHIMA Nao, ARAI Takayuki, SHINDO Mitsuko
    IEICE technical report. Speech, 105(685) 31-36, Mar 20, 2006  
    Many individuals experience some degree of hearing loss as they age. In previous studies, Arai et al. (2001, 2002) reported that steady-state suppression of speech improves speech intelligibility in reverberant environments. Steady-state portions are defined as those having more energy but being less crucial for speech perception. Kobayashi et al. (2005) confirmed the possibility of consonant enhancement for improved intelligibility when using a hearing aid, and the results indicated that the intelligibility of monosyllables improved with a significant difference for 50 elderly listeners...
  • MINAMIHATA Shinji, YASU Keiichi, KOBAYASHI Kei, ARAI Takayuki, SHINDO Mitsuko
    IEICE technical report. Speech, 105(685) 55-60, Mar 20, 2006  
    Elevation of the threshold of audibility occurs in hearing-impaired people, and these individuals have an expanded auditory filter (Glasberg and Moore, 1986). Threshold elevation is assumed to occur due to an increase in the frequency components that pass the auditory filter, an assumption known as the "power spectrum model" of masking (Patterson and Moore, 1986). Therefore, we attempted here to remove from the speech signal the frequency components that are not related to speech perception but are instead related to threshold elevation. We calculated the masking pattern using the spreading fu...
  • TAKAHASHI Kei, GOTO Takahito, TADOKORO Fumihiro, YASU Keiichi, ARAI Takayuki
    IEICE technical report. Speech, 105(685) 25-30, Mar 20, 2006  
    In an actual hall, reverberation degrades speech intelligibility as a result of overlap-masking, which occurs when segments of an acoustic signal are affected by reverberation components of previous segments. Arai et al. (2001, 2002) proposed a pre-processing technique that suppresses the steady-state portions of speech in order to prevent overlap-masking. However, the originally proposed technique is not suitable for real-time processing. Therefore, Arai et al. (2003) suggested an alternative technique based on the Fast Fourier Transform (FFT) with cepstral analysis i...
  • YASU Keiichi, KOBAYASHI Kei, ARAI Takayuki, HATTA Yukari, MINAMIHATA Shinji, SHINDO Mitsuko
    Proceedings of the Acoustical Society of Japan Meeting (CD-ROM), 2006 3-P-19, Mar 7, 2006  
  • AMINO Kanae, SUGAWARA Tsutomu, ARAI Takayuki
    Proceedings of the Auditory Research Meeting, 35(2) 91-96, Mar 3, 2005  
  • HIRAI Sawako, YASU Keiichi, ARAI Takayuki
    Proceedings of the Auditory Research Meeting, 35(2) 115-120, Mar 3, 2005  
  • F. Murai, T. Arai, T. Kimura
    Technical Report of IEICE Japan, SP2004-171 41-46, 2005  
  • S. Hirai, K. Yasu, T. Arai, K. Iitaka
    Technical Report of IEICE Japan, SP2004-168(696) 25-30, 2005  
    In a series of studies with native English speakers, Nittrouer et al. (1987) and others reported that adults are sensitive to fricative noise as the acoustic cue relevant to fricative identity, while children are sensitive to the vocalic transition. The difference between children and adults decreased as children increased in age. This study examined Nittrouer's findings with native Japanese adults. Subjects identified tokens from a /ʃ/-/s/ continuum followed by vocalic portions with formant transitions changing continuously from those appropriate for /ʃ/ to those for /s/. The results showed that native Japanese adults were also more sensitive to fricative noise than to the formant transition.
  • K. Amino, T. Sugawara, T. Arai
    Technical Report of IEICE Japan, SP2004-164 1-6, 2005  
  • K. Kobayashi, Y. Hatta, K. Yasu, N. Hodoshima, T. Arai, M. Shindo
    Technical Report of IEICE Japan, SP2004-155 7-12, 2005  
  • K. Yasu, Y. Miyauchi, N. Hodoshima, N. Hayashi, T. Inoue, T. Arai, M. Shindo
    Technical Report of IEICE Japan, SP2004-154(695) 1-6, 2005  
    In reverberant environments, speech intelligibility is reduced and it is difficult for people to perceive speech. This is due to overlap-masking by the reverberation components of previous segments (Bolt et al., 1949). For elderly and hearing-impaired people, perceiving and understanding speech in reverberant environments is an even more critical issue (Fitzgibbons and Gordon-Salant, 1999). There are two general approaches to improving speech intelligibility in reverberant environments: pre-processing and post-processing. To obviate the deterioration of speech intelligibility, Arai et al. suppressed the steady-state portions of speech (steady-state suppression), which contain more energy than transitions but are less crucial for speech perception, and confirmed a promising result for improving speech intelligibility (Arai et al., 2002). In previous studies, the effect of steady-state suppression was evaluated by measuring the intelligibility of original and steady-state-suppressed speech in simulated reverberant environments, convolving the speech with the impulse response of an auditorium and presenting it in a soundproof room (Hodoshima et al., 2003, 2004). The results of experiments with young normal-hearing people showed significant improvements in speech intelligibility using steady-state suppression at reverberation times between 0.7 s and 1.2 s. In this research, we conducted an experiment evaluating steady-state suppression at reverberation times of 1.0 s and 1.3 s with fifty elderly people. We found significant improvements in both reverberant conditions. Steady-state suppression also yielded more improvement in intelligibility for a presbycusis group than for a normal-hearing group. (A sketch of steady-state suppression follows this list.)
  • T. Arai
    J. Acoust. Soc. Am., 118(3) 1862-1862, 2005  
  • KOMATSU Masahiko, ARAI Takayuki, SUGAWARA Tsutomu
    Journal of the Phonetic Society of Japan, 7(3) 114-114, Dec 30, 2003  
  • ARAI Takayuki, GREENBERG Steven
    IEICE technical report. Speech, 103(155) 1-2, Jun 27, 2003  
  • HODOSHIMA Nao, ARAI Takayuki, INOUE Tsuyoshi, KINOSHITA Keisuke, KUSUMOTO Akiko
    IEICE technical report. Speech, 103(155) 61-65, Jun 27, 2003  
    One of the reasons reverberation degrades speech intelligibility is the effect of overlap-masking, in which segments of an acoustic signal are masked by the reverberation components of previous segments (Bolt et al., 1949). To reduce overlap-masking, Arai et al. suppressed steady-state portions, which have more energy but are less crucial for speech perception, and confirmed promising results for improving speech intelligibility (Arai et al., 2002). We conducted a perceptual test with steady-state suppression at reverberation times of 0.4 s to 1.0 s. The results showed significant improvements for some reverberation conditions, especially for stop consonants. We confirmed that steady-state suppression is an effective pre-processing method for improving speech intelligibility under reverberant conditions and demonstrated the effect of overlap-masking.
  • GREENBERG Steven, ARAI Takayuki
    IEICE technical report. Speech, 103(155) 27-36, Jun 27, 2003  
    Classical models of speech recognition assume that a detailed, short-term analysis of the acoustic signal is essential for accurately decoding the speech signal and that this decoding process is rooted in the phonetic segment. This paper presents an alternative view, one in which the time scales required to accurately describe and model spoken language are both shorter and longer than the phonetic segment, and are inherently wedded to the syllable. The syllable reflects a singular property of the acoustic signal, the modulation spectrum, which provides a principled, quantitative framework to describe the process by which the listener proceeds from sound to meaning. The ability to understand spoken language (i.e., intelligibility) vitally depends on the integrity of the modulation spectrum within the core range of the syllable (3-10 Hz) and reflects the variation in syllable emphasis associated with the concept of prosodic prominence ("accent"). A model of spoken language is described in which the prosodic properties of the speech signal are embedded in the temporal dynamics associated with the syllable, a unit serving as the organizational interface among the various tiers of linguistic representation. (A modulation-spectrum sketch follows this list.)
  • K. Yasu, M. Hishitani, T. Arai, Y. Murahara, K. Shinohara
    Proceedings of the Acoustical Society of Japan Meeting, 2002 379-380, Sep 26, 2002  
  • M. Hishitani, K. Kobayashi, K. Shinohara, K. Yasu, T. Arai
    Handbook of the International Hearing Aid Research Conference (IHCON), 64-65, 2002  
  • M. Komatsu, T. Arai
    Meeting Handbook of the Linguistic Association of Canada and the United States (LACUS) Forum, 39-39, 2002  
  • Y. Kaneko, T. Sugawara, T. Arai, K. Okazaki, K. Iitaka
    Joint Conf. of the IX International Congress for the Study of Child Language and the Symposium on Research in Child Disorders, 145-145, 2002  

Presentations

 185

Works

 11

Research Projects

 36

Academic Activities

 1

Social Activities

 1

Other

 55
  • Apr, 2006 - Jun, 2008
    In a course on giving presentations in English, students' presentations are videotaped so that they can later view the recordings and evaluate themselves objectively. Students are also asked to give a second presentation on the same content, encouraging them to make improvements.
  • 2003 - Jun, 2008
    Served as a member of the committee on acoustics education, organizing education sessions (for example, the education session at the joint meeting of the Acoustical Societies of America and Japan held in December 2006).
  • 2003 - Jun, 2008
    Served as a member of the committee on acoustics education, organizing education sessions (for example, the education session at the International Congress on Acoustics held in April 2004). Appointed chair of the committee in 2005 and active since then (for example, holding a science classroom at the National Museum in October 2006).
  • Apr, 2002 - Jun, 2008
    Since joining the university, a "Progress Report" documenting the laboratory's educational and research activities has been compiled and published. This has also helped raise the awareness of students in the laboratory and has proven effective.
  • Apr, 2002 - Jun, 2008
    Because regular exposure to English is considered important, some of the laboratory's regular meetings are held in English. In addition, since the 2006 academic year, the progress reports given by each research group have also been required to be in English.