The maker: for whose pleasure do we train the models?
Who is the curator / the audience?
What is the purpose?
“If we employ machine intelligence to augment our writing activities, it’s worth asking how such technology would affect how we think about writing as well as how we think in the general sense.” (Ross Goodwin)
How to start
- recommendation from R. Fiebrink about sound and triggering the installation
Use sound recognition to detect whether someone is talking to one of the artefacts; this will solve the problem of knowing how many people are effectively interacting with the installation. Play with the sensitivity level to get the sound data we need.
- Build up one “ear” node with a microphone; compare options to understand which type of microphone we need > lavalier microphone
- Build up a “mouth” to play back the response of the chatbot installation; play with the idea that a microphone is a speaker in reverse?
- Version 1: a simple version via p5.js, which needs cloud access for speech recognition
- Version 2: speech recognition with the BitVoicer API; check the Arduino tutorial on speech recognition and synthesis
- Would need a private router to secure cloud access for the text-to-speech > check authorisation for connecting the private router to the internet
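The “is someone talking to the ear node?” idea above could start as simply as a loudness threshold before any real speech recognition. A minimal sketch, assuming the microphone delivers audio frames as lists of floats in -1.0..1.0 (the frame format and the threshold value are assumptions to tune per space):

```python
import math

def rms(samples):
    """Root-mean-square level of one audio frame (floats in -1.0..1.0)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_talking(samples, threshold=0.1):
    """True when the frame is loud enough to count as someone talking.

    `threshold` is the sensitivity level to play with: raise it in a
    noisy gallery, lower it for a quiet room.
    """
    return rms(samples) > threshold
```

Tuning `threshold` against the room’s ambient noise is the “play with the sensitivity level” step.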
To generate responses from the chatbot: build up different training sets with different methods: Markov chain; char-RNN (review and basic char-RNN training from a GitHub source); feed TensorFlow’s sequence-to-sequence library.
Write different scripts via RiveScript to map input text to output text (use arrays in RiveScript?); generate different characters, i.e. who you are talking to; use a list of trigger words in an array to output a certain sentence.
Or use the built-in mic of the Mac.
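Of the training methods listed above, the Markov chain is the quickest to prototype before moving to char-RNN or seq2seq. A minimal word-level sketch (corpus, start word, and output length are placeholder examples):

```python
import random
from collections import defaultdict

def train_markov(text):
    """Map each word in the corpus to the list of words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length=10, seed=None):
    """Random walk through the chain, starting from `start`."""
    rng = random.Random(seed)
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: no word ever followed this one
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)
```

Swapping the toy corpus for each character’s training set would give each artefact its own voice, which is the “different training sets” idea above.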
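The trigger-word idea is what RiveScript provides natively (triggers, arrays, responses in its own script syntax); the underlying logic can be sketched in plain Python. The trigger words and character lines here are made-up examples, not from the project:

```python
# Hypothetical trigger -> response table for one character.
TRIGGERS = {
    "hello": "Oh, a visitor. I was starting to talk to myself.",
    "machine": "I am only as clever as my training set.",
    "goodbye": "Leaving already? The other nodes will gossip.",
}
DEFAULT = "I heard you, but I have nothing trained for that."

def respond(utterance):
    """Return the response for the first trigger word found in the input."""
    lowered = utterance.lower()
    for trigger, response in TRIGGERS.items():
        if trigger in lowered:
            return response
    return DEFAULT
```

One table per character would give each node a distinct persona, with RiveScript replacing this dictionary once the scripts are written.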
- after tutorial with H. Pritchard
Produce a mixed reality with the electronic node devices and human interpretation by the nodes: node A activates node B / one sensor, one motor, one movement / small (or big) size nodes.
Create the narrative: when no one (or not enough people) is looking or talking, the nodes interact with each other; when there are enough people, or one person, looking or talking, the nodes stop moving and talk back.
Using text-to-speech and maybe speech recognition.