About nishimotz

A freelance consultant. Doctor of Engineering. Speech interfaces, open-source software, accessibility, #nvdajp. Facebook: http://bit.ly/ckUk20

NVDA development in Japan

NVDA (NonVisual Desktop Access) is developed by an NPO based in Australia.

I have been participating as the project leader of the Japanese localization of NVDA since 2011.

Assistive technology is evolving, just like other technologies.

In Japan, most of the core technologies, such as text input methods, speech synthesis, and automatic braille translation, have not been freely available, so the Japanese community is behind the times.
This is the reason I have been so deeply committed to the project since the earthquake.

nvdajp 2011.3j was released on Dec 23.
Its enhancements include Japanese TTS, IME support, and braille support.
The default speech engine is JTalk, an open-source Japanese TTS based on Open JTalk.

The functions for reporting Japanese input method activity have the following limitations:

(1) IME support does not work properly with WordPad, Microsoft Word, or Internet Explorer.

(2) With Internet Explorer and some other applications, after typing Japanese characters in a text box,
pressing the Escape key without finishing composition may cause a conflict between leaving edit mode and canceling the Japanese composition.

(3) With 32-bit applications, the key echo of Japanese input is delayed, because the report is performed on key-up events.

The IME support function can be disabled in the Keyboard settings.

DirectBM is an experimental braille driver that supports KGS Japanese braille displays.

Although this activity is currently separate from mainstream development, international cooperation, such as helping with the development of Chinese text input support, is very important.

We have started discussions with an NPO in Japan to obtain support. I am also seeking consulting work related to the software, or a full-time research position in which I can stay engaged in NVDA development. I hope to be able to share good news this year.

Hanoi, Vietnam

Four-day visit to Hanoi, Vietnam.

Golden Sun Legend Hotel. The room was soundproof and comfortable.
The hotel manager arranged airport transportation and a one-day Halong Bay trip (including kayaking). Currency exchange was also available there. The room PC ran Windows 7 Ultimate. WiFi service was available for guests.



Research on Effective Designs and Evaluation for Speech Interface Systems

This post is an excerpt from the draft abstract of my doctoral dissertation.
The public hearing for my dissertation will be held this month at Waseda University.
Although the thesis itself is written in Japanese, I plan to write about related topics here in English.

This paper describes a systematic way of enabling developers and designers to successfully build information and communication systems with speech technologies, such as speech synthesis and speech recognition. As a result of this work, applications of speech technologies can be made easy for everyone to use.

This work also describes four research projects, including the development of speech applications and the evaluation of speech interfaces, which were carried out based on the proposed methodology.


ICCHP 2010 talk

My talk on audio CAPTCHA at ICCHP finished yesterday in Vienna.

I enjoyed using Twitter during the conference. Thank you.

Dear new friends,
Usually I tweet in Japanese with the @nishimotz account. I prefer using Facebook for English conversation, so please feel free to unfollow me.
Both Facebook and the @nishimotz account on Twitter can be used for English conversation.
I will tweet in Japanese with the @nishimtz account.

My slides and my tweets at ICCHP are as follows: Takuya Nishimoto, Takayuki Watanabe: The Evaluations of Deletion-Based Method and Mixing-Based Method for Audio CAPTCHAs.

Notice (2010-10-16): Updated slides on this topic (for Interspeech in Sep. 2010) are available.


DMCPP: development of another dialog manager

An experimental dialog manager for Galatea on Linux, written in C++ and using lib-julius and OpenCV, is under development.
The project focuses on low-level multimodal event handling, while Galatea Dialog Studio focuses on higher-level dialog management using VoiceXML.
Although the current version is just an application skeleton, I would like to ask developers for feedback.

DMCPP in English
DMCPP in Japanese

Please join the mailing lists for discussion.

pyAA

To test the MSAA-related features of the Microsoft Japanese Input Method Editor (MS-IME 2002) on the Japanese version of Windows, I am working with pyAA.
This is preliminary work for the localization of NVDA for Japanese users.

I tried to adapt the original pyAA to Python 2.6.x.
First I obtained the source code (of the simpler branch) from the CVS repository, then I built it with Visual Studio 2008 and SWIG. I also modified the code so that the Value property can be accessed correctly in multibyte character encoding environments.

The work is proceeding successfully at the moment.
At the previous meeting of the NVDAjp project, we added some code to NVDA and verified that the WinEvents of MS-IME 2002 can be captured and that the Value property can be accessed through the IAccessible interface.

Related pages in Japanese (not yet translated): pyaa and nvdajp

Note on Feb 27: I created a GitHub repository for pyaa.

Voice interface and effectiveness

One of my colleagues gave a presentation at the Human-Agent Interaction symposium in Tokyo yesterday.

The assumption is that human-like spoken dialogs are highly effective. Our proposal is to use reinforcement learning to acquire a strategy for responding quickly to overlapped utterances, interruptions, or gestures during spoken dialogs between humans and machines. Although the research is still at an early stage, we hope that something like mind-reading will become possible; in other words, users of spoken dialog systems will not need to say everything from beginning to end.

orpheus_tw

I am developing a service called orpheus_tw.
Japanese songs composed by the automatic composition system “Orpheus” (a research project at the University of Tokyo) can be shared with the followers of the Twitter account @orpheus_tw.

This service was built with Ruby and Rails, and is hosted on Heroku. The additional “delayed job” option is also used.

Research on Spoken Dialogue Agent

The upcoming publications at Human Agent Interaction Symposium (HAI2009) are as follows:

  • Masayuki Nakazawa, Takuya Nishimoto, Shigeki Sagayama:
    Title: Behavior Generation for Spoken Dialogue Agent by Dynamical Model
    Abstract: For spoken dialog systems with anthropomorphic agents, it is important to give humans a natural impression and a sense of real presence. For this purpose, head and gaze controls of the agent that are consistent with the spoken dialog are expected to be effective. Our approach is based on the following hypotheses: 1) An agent carries out the dialog concurrently with intentional control of the head and gaze to retrieve information and to give signals. 2) The movement of the head and eyeballs is based on mathematical models. To achieve these purposes, we have adopted a mathematical model for the movements of the agent.
    Formulating the movements with a mathematical model has several merits: a) the parameters can reflect subjectivity, so that various movements can be generated from the model, b) the movements of the agent can reflect personality, c) the continuous movements of the agent can be controlled mathematically. In this paper, we propose a mathematical model based on a second-order system, compare it with a linear model, and show its superiority.
  • Di Lu, Masayuki Nakazawa, Takuya Nishimoto, Shigeki Sagayama:
    Title: Barge-in Control with Reinforcement Learning for Efficient Multi-modal Spoken Dialogue Agent
    Abstract: To make the dialogue between the agent and the user smoother, we propose a multi-modal user simulator that could be widely used in real-time agent control for multi-modal dialog agents with reinforcement learning. We also implemented a prototype system that utilizes the result of the reinforcement learning.
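
The first abstract above contrasts a second-order system with a linear model for head and gaze movement, but it gives no equations. As a rough illustration only, the sketch below uses a generic textbook second-order system, x'' + 2*zeta*omega*x' + omega^2*x = omega^2*target, next to a constant-speed linear ramp. The function names, the parameters omega and zeta, the time step, and the integration scheme are all assumptions made for this example and are not taken from the paper.

```python
def second_order_trajectory(target, steps, dt=0.01, omega=10.0, zeta=1.0):
    """Angle trajectory of a generic second-order system (illustrative only).

    Integrates x'' + 2*zeta*omega*x' + omega**2 * x = omega**2 * target
    with semi-implicit Euler, starting at rest from angle 0.
    omega (natural frequency) and zeta (damping ratio) are hypothetical
    parameters chosen for this sketch, not values from the paper.
    """
    x, v = 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        # Acceleration pulls toward the target and is damped by the velocity.
        a = omega * omega * (target - x) - 2.0 * zeta * omega * v
        v += a * dt
        x += v * dt
        trajectory.append(x)
    return trajectory


def linear_trajectory(target, steps, duration_steps):
    """Constant-speed ramp that reaches the target after duration_steps."""
    return [target * min(1.0, (i + 1) / duration_steps) for i in range(steps)]
```

For example, `second_order_trajectory(30.0, 200)` turns the head toward 30 degrees with a smooth start and stop, while `linear_trajectory(30.0, 200, 50)` moves at constant speed and stops abruptly. Lowering `zeta` below 1 makes the second-order movement overshoot the target before settling, which is one way a single parameter could give an agent a different movement style.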

Date: Fri, Dec 4 – Sat, Dec 5, 2009

Place: Tokyo Institute of Technology

Language: Japanese

Date: Thu, Oct 29 – Fri, Oct 30, 2009

Place: ASPAM (Aomori City, Japan)

Language: Japanese