
Hizkuntz, Ahots eta Multimedia Teknologien Behatokia (Observatory of Language, Speech and Multimedia Technologies)

http://www.dataversity.net/how-consumer-focused-artificial-intelligence-startups-are-breaking-down-language/
2018/05/02 - 19:12

by Angela Guess Larry Alton recently wrote for TechCrunch, “Language is a distinctly human trait. Or is it? A new wave of consumer-focused startups are now developing artificial intelligence programs that can mimic, adapt and interpret the complex semantic patterns that were previously unrecognizable to logical, mathematically driven machines. Evolutionarily, complex linguistic patterns are one…

The post How Consumer-Focused Artificial Intelligence Startups are Breaking Down Language appeared first on DATAVERSITY.

http://www.dataversity.net/brain-like-algorithm-enables-real-time-natural-language-processing/
2018/05/02 - 19:12

by Angela Guess A recent article out of the company reports, “Cortical.io, an innovator in Natural Language Processing (NLP), announces the availability of its Retina engine in the Microsoft Azure Marketplace. Based on an advanced proprietary machine intelligence algorithm, the Cortical.io Retina engine encompasses a wide range of highly efficient NLP tools for text filtering,…

The post Brain-Like Algorithm Enables Real Time Natural Language Processing appeared first on DATAVERSITY.

http://www.dataversity.net/the-reality-of-natural-language-processing/
2018/05/02 - 19:12

by Jelani Harper Natural Language Processing (NLP) may seem the least notable of Semantic technologies and their applications, particularly when considering the hype surrounding graph databases, Cognitive Computing, and the Internet of Things. Nonetheless, Markets and Markets reports that the NLP market is projected to grow to $13.4 billion--with a CAGR of 18.4 percent—in the next…

The post The Reality of Natural Language Processing appeared first on DATAVERSITY.

http://andonisagarna.blogspot.com/2015/09/googledocs-ek-euskaraz-diktatzen-zaiona.html
2018/05/02 - 19:12

First of all, let me say that I have tried it and it works quite well. If you want to try it yourself, this is what you need to do (a small programmatic sketch follows the steps):

1. In the Chrome browser, create a text document in Google Docs (or Google Drive).

2. In the File menu, choose Language > Basque (Euskara).

3. In the Tools menu, select Voice typing.

4. Click the microphone icon and start dictating.
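
The post describes the Google Docs voice-typing UI; as a rough programmatic equivalent, here is a minimal sketch using the third-party SpeechRecognition Python package and Google's free web speech endpoint. The package, the endpoint and the "eu-ES" language code for Basque are my assumptions for illustration, not anything the post covers:

```python
# Minimal dictation sketch with the third-party SpeechRecognition package
# (pip install SpeechRecognition pyaudio); illustrative only.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)   # calibrate for room noise
    print("Speak in Basque...")
    audio = recognizer.listen(source)

# "eu-ES" is the assumed language code for Basque in Google's web speech API.
try:
    print(recognizer.recognize_google(audio, language="eu-ES"))
except sr.UnknownValueError:
    print("Could not understand the audio.")
```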


http://andonisagarna.blogspot.com/2015/09/hizketa-askearen-analisi-automatikoaren.html
2018/05/02 - 19:12

Source: Nature

Psychiatry has nothing comparable to the objective clinical tests routinely used in other specialties. However, a computer program developed by IBM, the Columbia University Medical Center and researchers from South America can, by automatically analyzing speech, predict a psychotic episode in young people at risk of schizophrenia.

People with schizophrenia show mild disturbances in their speech, even before they develop psychosis for the first time. The program simulates how the human brain understands the language used in interview transcripts. By analyzing, across 34 people, how the flow of meaning slows from one spoken sentence to the next, together with grammatical markers of speech complexity, the researchers managed to identify the five people who later developed psychosis.
The aim of the experiment was to combine automatic discourse analysis with machine learning in order to predict, ahead of time, psychotic episodes in young people at risk of psychosis.
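
The "flow of meaning from one sentence to the next" can be approximated as the semantic coherence between consecutive sentence vectors. A minimal sketch of that idea is below; the toy word vectors and the averaging scheme are my own illustration (the study itself used Latent Semantic Analysis on interview transcripts):

```python
import numpy as np

# Toy word vectors standing in for a real embedding model; the values are
# made up purely for illustration.
WORD_VECTORS = {
    "i": np.array([0.1, 0.9, 0.2]),
    "saw": np.array([0.3, 0.7, 0.1]),
    "a": np.array([0.2, 0.2, 0.2]),
    "dog": np.array([0.9, 0.1, 0.3]),
    "it": np.array([0.2, 0.8, 0.2]),
    "barked": np.array([0.8, 0.2, 0.4]),
    "planets": np.array([0.1, 0.1, 0.9]),
    "orbit": np.array([0.2, 0.1, 0.8]),
}

def sentence_vector(sentence):
    """Average the vectors of known words; ignore unknown words."""
    vecs = [WORD_VECTORS[w] for w in sentence.lower().split() if w in WORD_VECTORS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def minimum_coherence(sentences):
    """Minimum cosine similarity between consecutive sentence vectors.

    Low values indicate abrupt jumps in meaning from one sentence to the
    next, the kind of feature the study combined with a classifier."""
    sims = []
    for prev, curr in zip(sentences, sentences[1:]):
        a, b = sentence_vector(prev), sentence_vector(curr)
        sims.append(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)))
    return min(sims)

print(minimum_coherence(["I saw a dog", "It barked", "Planets orbit"]))
```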

Of the young people who took part in the experiment, 11 were girls and the rest were boys. They were tested every three months for two and a half years. Five of them went on to have an episode.


http://andonisagarna.blogspot.com/2015/08/nola-itzultzen-ditu-google-translatek.html
2018/05/02 - 19:12

Source: Google Research Blog

Last year Google added an interesting feature to Google Translate: real-time translation of the image of a text captured with the phone or tablet camera. Here is how that is achieved:

First of all, when the image comes in from the camera, the Google Translate app has to find the letters in it. To do so it must discard everything in the photo that is not a letter, be it flowers, windows or anything else, and keep only the words to be translated. It looks for blobs of pixels of similar color that are also close to other similar blobs of pixels.

Those blobs could be letters, and if some are close to others they form a continuous line that should be read as text. After that, Translate has to work out which letter each one is. This is where deep learning is applied: they use "convolutional neural networks". A convolutional neural network is an artificial neural network whose neurons respond to receptive fields much like the neurons of the visual cortex in a biological brain. They are used in computer vision, mainly to classify images.
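
The original post does not include code; as a rough illustration of the kind of convolutional classifier it describes, here is a minimal sketch in PyTorch. The framework, layer sizes and input size are my assumptions, not Google's actual production model:

```python
import torch
import torch.nn as nn

class GlyphClassifier(nn.Module):
    """Tiny CNN mapping a 28x28 grayscale crop to one of n_classes letters.

    The architecture is illustrative only; the real model is not public."""
    def __init__(self, n_classes: int = 26):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = GlyphClassifier()
logits = model(torch.randn(1, 1, 28, 28))   # one fake glyph crop
print(logits.argmax(dim=1))                  # predicted letter index
```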

The system is trained to tell letters from non-letters, and in that way it can learn what each letter looks like. Things are not quite so simple, though: if we train it to recognize only neat, clean letters, we run the risk of it failing to recognize the letters it meets in reality. Real letters come with reflections, dirt, smudges and all sorts of imperfections. The letter generator they built is therefore able to imitate every kind of real-world dirt.

One might ask why they did not simply train on photos of real-world letters. The reason is that it is hard to find enough examples in every language, and even harder to control precisely which examples are used to train an effective neural network. It is therefore more efficient to simulate the dirt. Dirt and rotations are applied, but not too much, so as not to confuse the neural network.
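
A sketch of this kind of "simulated dirt" augmentation is below, assuming glyph images as NumPy arrays with values in [0, 1]. The rotation range, noise level and speck probability are made-up parameters for illustration; Google's actual generator is not public:

```python
import numpy as np
from scipy.ndimage import rotate

def dirty_copy(glyph, rng, max_rotation_deg=10.0, noise_std=0.05):
    """Return a lightly corrupted copy of a clean glyph image (values in [0, 1]).

    Small rotations and low-amplitude noise imitate real-world smudges and
    misalignment; both are kept small so the network is not confused by
    variation that does not matter."""
    angle = rng.uniform(-max_rotation_deg, max_rotation_deg)
    out = rotate(glyph, angle, reshape=False, mode="nearest")
    out = out + rng.normal(0.0, noise_std, size=out.shape)   # faint noise
    specks = rng.random(out.shape) < 0.01                    # a few dark specks
    out[specks] = 0.0
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
clean = np.ones((28, 28))          # stand-in for a clean letter image
augmented = dirty_copy(clean, rng)
```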

The third step is to take the recognized letters and look them up in a dictionary in order to produce the translations. Since any of the previous steps may have failed in one way or another, the dictionary lookups have to be approximate. That way, even if an "S" has been read as a "5", a word containing an S can still be recognized correctly.
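
A toy version of such an approximate lookup is shown below, using Python's standard difflib. This is just one of many ways to do fuzzy matching; the method the real system uses is not described in the post:

```python
import difflib

DICTIONARY = ["slow", "stop", "shop", "sale"]

def fuzzy_lookup(word, cutoff=0.6):
    """Return the closest dictionary entry, tolerating OCR confusions.

    Common confusions such as '5' for 'S' are normalized first; remaining
    differences are absorbed by difflib's similarity cutoff."""
    confusions = str.maketrans({"5": "s", "0": "o", "1": "l"})
    cleaned = word.lower().translate(confusions)
    matches = difflib.get_close_matches(cleaned, DICTIONARY, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(fuzzy_lookup("5TOP"))   # -> "stop", even though the S was read as a 5
```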

Finally, the translation is rendered on top of the original words, in the same style as the original. The color surrounding the letters is used to erase the original letters, so the translation appears over the original background color.

Then another problem has to be overcome. If the result only had to be displayed on a powerful computer there would be no issue, but it has to be displayed on a phone, and the connection may also be rather slow. The neural network therefore has to be very small, and strict limits have to be placed on the data set it is trained on, so that the information density it must process stays below a certain bound.

The training data has to be made as effective as possible. For instance, the network is taught to recognize letters that are slightly rotated, but not overly rotated: if a letter were rotated too far, the neural network would have to spend too much of its information density on things that matter little. Effort also goes into building tools that give short iteration times and good visualizations.
 
Within a few minutes, they can change the algorithms that generate the training data, generate the data, retrain and visualize the result. That way they can see which kinds of letters fail and why. To make all of this possible in real time, they have deeply optimized and hand-tuned the mathematical operations.

http://feedproxy.google.com/~r/blogspot/gJZg/~3/UaaY5sYW9Nw/crowdsourcing-text-to-speech-voice-for.html
2018/05/02 - 19:12

Posted by Linne Ha, Senior Program Manager, Google Research for Low Resource Languages

Building a decent text-to-speech (TTS) voice for any language can be challenging, but creating one – a good, intelligible one – for a low resource language can be downright impossible. By definition, working with low resource languages can feel like a losing proposition – from the get go, there is not enough audio data, and the data that exists may be questionable in quality. High quality audio data, and lots of it, is key to developing a high quality machine learning model. To make matters worse, most of the world’s oldest, richest spoken languages fall into this category. There are currently over 300 languages, each spoken by at least one million people, and most will be overlooked by technologists for various reasons. One important reason is that there is not enough data to conduct meaningful research and development.

Project Unison is an on-going Google research effort, in collaboration with the Speech team, to explore innovative approaches to building a TTS voice for low resource languages – quickly, inexpensively and efficiently. This blog post will be one of several to track progress of this experiment and to share our experience with the research community at large – our successes and failures in a trial and error, iterative approach – as our adventure plays out.

One of the most critical aspects of building a TTS system is acquiring audio data. The traditional way to do this is in a professional recording studio with a voice talent, sound engineer and a voice director. The process can take considerable time and can be quite expensive. People often assume voice talent work is similar to that of a news reader, but it is highly specialized and can be very difficult.

Such investments in time and money may yield great audio, but the catch is that even if you’ve created the best TTS voice from these recordings, at best it will still sound exactly like the voice talent - the person who provided the raw audio data. (We’ve read the articles about people who have fallen for their GPS voice to find that they are real people with real names.) So the interesting problem here from a research perspective is how to create a voice that sounds human but is not identifiable as a singular person.

Crowd-sourcing projects for automatic speech recognition (ASR) for Google Voice Search had been successful in the past, with public volunteers eager to participate by providing voice samples. For ASR, the objective is to collect from a diversity of speakers and environments, capturing varying regional accents. The polar opposite is true of TTS, where a single speaker with a standard accent, recorded in a soundproof studio, is the basic requirement.

Many years ago, Yannis Agiomyrgiannakis, Digital Signal Processing researcher on the TTS team in Google London, wrote a “manifesto” for acoustic data collection for 2000 languages. In his document, he gave technical specifications on how to convert an average room into a recording studio. Knot Pipatsrisawat, software engineer in Google Research for Low Resource Languages, built a tool that we call “ChitChat”, a portable recording studio, using Yannis’ specifications. This web app allows users to read the prompt, play back the recording and even assess the noise level of the room.

From other past research in ASR, we knew that the right tool could solve the crowd sourcing problem. ChitChat allowed us to experiment in different environments to get an idea of what kind of office space would work and what kind of problems we might encounter. After experimenting with several different laptops and tablets, we were able to find a computer that recognized the necessary peripherals (the microphone, USB converter, and preamp) for under $2,000 – much cheaper than a recording studio!

Now we needed multiple speakers of a single language. For us, it was a no-brainer to pilot Project Unison with Bangladeshi Googlers, all of whom are passionate about getting Google products to their home country (the success of Android products in Bangladesh is an example of this). Googlers by and large are passionate about their work and many offer their 20% time as a way to help, to improve or to experiment on something that may or may not work because they care. The Bangladeshi Googlers are no exception. They embodied our objectives for a crowdsourcing innovation: out of many, we could achieve (literally) one voice.

With multiple speakers, we would target speakers of similar vocal profiles and adapt them to create a blended voice. Statistical parametric synthesis is not new, but the advances in recent technology have improved quality and proved to be a lightweight solution for a project like ours.

In May of this year, we auditioned 15 Bangladeshi Googlers in Mountain View. From these recordings, the broader Bangladeshi Google community voted blindly for their preferred voice. Zakaria Haque, software engineer in Machine Intelligence, was chosen as our reference for the Bangla voice. We then narrowed down the group to five speakers based on these criteria: Dhaka accent, male (to match Zakaria’s), similarity in pitch and tone, and availability for recordings. The original plan of a spectral analysis using PRAAT proved to be unnecessary with our limited pool of candidates.

All 5 software engineers – Ahmed Chowdury, Mohammad Hossain, Syeed Faiz, Md. Arifuzzaman Arif, Sabbir Yousuf Sanny – plus Zakaria Haque recorded over 3 days in the anechoic chamber, a makeshift sound-proofed room at the Mountain View campus just before Ramadan. HyunJeong Choe, who had helped with the Korean TTS recordings, directed our volunteers.

Left: TPM Mohammad Khan measures the distance from the speaker to the mic to keep the sound quality consistent across all speakers. Right: Analytical Linguist HyunJeong Choe coaches SWE Ahmed Chowdury on how to speak in a friendly, knowledgeable, "Googly" voice

ChitChat allowed us to troubleshoot on the fly as recordings could be monitored from another room using the admin panel. In total, we recorded 2000 Bangla and English phrases mined from Wikipedia. In 30-60 minute intervals, the participants recorded over 250 sentences each.

In this session, we discovered an issue: a sudden drop in amplitude at high frequencies in a few recordings. We were worried that all the recordings might have to be scrapped.

As illustrated in the third image, speaker3 shows a drop in energy above 13 kHz, which is visible in the graph and may be audible in the speech, distorting the speaker’s voice to sound as if he were speaking through a tube.
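
A quick way to screen recordings for this kind of problem is to compare the energy above and below the cutoff frequency. The sketch below is my own illustration (not the tooling the team used) and assumes 16-bit mono WAV files:

```python
import numpy as np
from scipy.io import wavfile

def high_band_energy_ratio(path, cutoff_hz=13_000.0):
    """Fraction of spectral energy above cutoff_hz in a mono WAV file.

    A value near zero for one speaker but not the others would flag the
    kind of high-frequency drop described above."""
    rate, samples = wavfile.read(path)
    samples = samples.astype(np.float64)
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    total = spectrum.sum()
    return float(spectrum[freqs >= cutoff_hz].sum() / total) if total else 0.0

# Usage (hypothetical file names):
# for f in ["speaker1.wav", "speaker2.wav", "speaker3.wav"]:
#     print(f, high_band_energy_ratio(f))
```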

Another challenge was that we didn’t have a pronunciation lexicon for Bangla as spoken in Bangladesh. We worked initially with the publicly available TTS data from the Indian Institute of Information Technology, but this represented the variant of Bangla spoken in West Bengal (India), which differs from the speech we recorded. Our internally designed pronunciation rules for Bengali were also aimed at West Bengal and would need to be revised later.

Deciding to proceed anyway, Alexander Gutkin, Speech software engineer and lead for TTS for Low Resource Languages in Google London, built an initial prototype voice. Using the preliminary text normalization rules created by Richard Sproat, Speech and Language Processing researcher, the first voice we attempted proved to be surprisingly good. The problem in the high frequencies we had seen in the recordings is undetectable in the parametric voice.

When we return to the sound studio to record an additional 200 longer sentences, we plan to try an upgrade of the USB converter. Meanwhile, Martin Jansche, Natural Language Understanding software engineer, has worked with a team of native speakers on a pronunciation lexicon and model that better matches the phonology of colloquial Bangladeshi Bangla. Alexander will use the additional recordings and the new pronunciation dictionary to build the second version.

NEXT UP: Building a parametric voice with multiple speaker data (Ep.2)

https://lingpipe-blog.com/2014/03/08/lucene-4-essentials-for-text-search-and-indexing/
2018/05/02 - 19:12

Here’s a short-ish introduction to the Lucene search engine which shows you how to use the current API to develop search over a collection of texts. Most of this post is excerpted from Text Processing in Java, Chapter 7, Text Search with Lucene. Lucene Overview: Apache Lucene is a search library written in Java. It’s […]

http://feedproxy.google.com/~r/StreamHacker/~3/8qP4-yCzsxU/
2018/05/02 - 19:12

This is a short story about the text-processing.com API, and how it became a profitable side-project, thanks to Mashape.

Text-Processing API

When I first created text-processing.com, in the summer of 2010, my initial intention was to provide an online demo of NLTK’s capabilities. I trained a bunch of models on various NLTK corpora using nltk-trainer, then started making some simple Django forms to display the results. But as I was doing this, I realized I could fairly easily create an API based on these models. Instead of rendering HTML, I could just return the results as JSON.
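
Calling such a JSON API from Python might look like the sketch below; it uses the requests library, and the endpoint path and response fields are assumptions for illustration rather than a guaranteed description of text-processing.com:

```python
import requests

# Hypothetical call to a sentiment endpoint; the path and response fields
# are assumed for illustration.
response = requests.post(
    "http://text-processing.com/api/sentiment/",
    data={"text": "NLTK makes building language tools pleasant."},
    timeout=10,
)
response.raise_for_status()
result = response.json()   # JSON payload instead of rendered HTML
print(result)
```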

I wasn’t sure if anyone would actually use the API, but I knew the best way to find out was to just put it out there. So I did, initially making it completely open, with a rate limit of 1000 calls per day per IP address. I figured at the very least, I might get some PHP or Ruby users that wanted the power of NLTK without having to interface with Python. Within a month, people were regularly exceeding that limit, and I quietly increased it to 5000 calls/day, while I started searching for the simplest way to monetize the API. I didn’t like what I found.
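
A per-IP daily rate limit like the one described can be sketched in a few lines. This in-memory version is illustrative only, not the site's actual middleware; a real deployment would back it with a shared store:

```python
from collections import defaultdict
from datetime import date

class DailyRateLimiter:
    """Allow at most `limit` calls per client IP per calendar day.

    Illustrative in-memory version; a production service would use a
    shared store such as Redis or memcached instead of a Python dict."""
    def __init__(self, limit=1000):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, ip):
        key = (ip, date.today())
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True

limiter = DailyRateLimiter(limit=1000)
print(limiter.allow("203.0.113.7"))   # True until the daily limit is hit
```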

Monetizing APIs

Before Mashape, your options for monetizing APIs were either building a custom solution for authentication, billing, and tracking, or paying thousands of dollars a month for an “enterprise” solution from Mashery or Apigee. While I have no doubt Mashery & Apigee provide quality services, they are not in the price range for most developers. And building a custom solution is far more work than I wanted to put into it. Even now, when companies like Stripe exist to make billing easier, you’d still have to do authentication & call tracking. But Stripe didn’t exist 2 years ago, and the best billing option I could find was Paypal, whose API documentation is great at inducing headaches. Lucky for me, Mashape was just opening up for beta testing, and appeared to be in the process of solving all of my problems.

http://andonisagarna.blogspot.com/2015/03/bbck-bilbon-probatu-zituen-itzulpen.html
2015/03/26 - 14:03

Source: BBC

A BBC journalist took a handful of translation apps and headed to Bilbao to put them to the test.

The tests were along these lines:

  • Going to the Guggenheim museum and finding out which artwork carries the highest insurance
  • Asking someone at the Moiua metro station the best way to get to the Plaza Barria
  • Finding someone in the Plaza Barria who would tell the story of their first kiss
  • Finding the Gili-Gili clothes shop and asking someone inside to take a selfie with the journalist and something sold there
  • Taking a taxi to the Concha café and, once there, asking which pintxo they sell the most of
  • Going to the bullfighting museum and asking the staff how many people the bullring holds when it fills up, and when the previous one was demolished
  • Buying a postcard and a one-euro stamp and sending it to the Technology desk

Since Google Translate is not set up for spoken Basque, the journalist says he opted for English-to-Spanish translation.

Even though the conditions were almost ideal, that is, there was no background noise inside the venues and the people willing to take part in the test already knew about real-time translation apps, at first he had trouble getting them to handle even fairly simple sentences.
A number of problems came up in those tests. The journalist says these apps are still rather green. Joseba Abaitua, a lecturer at the University of Deusto, told him that this is where the limits of speech recognition technology become visible: holding a conversation with someone standing right in front of you by running translations through a phone will always be a little awkward, but the apps will work better and better as people use them.
 
It has to be kept in mind that you are talking to a machine. If the sentences are short and clearly spoken, the system will work well, but with careless speech it will fail. These systems still need to improve a great deal at recognizing different accents and ways of speaking.

Nor are all the apps of the same quality.
