+ working code with console and signal backend

nicobo 2020-05-12 07:46:27 +02:00
parent f158af1fc6
commit bfe3cda1e0
No known key found for this signature in database
GPG key ID: 2581E71C5FA5285F
14 changed files with 1173 additions and 2 deletions

4
.gitignore vendored

@ -127,3 +127,7 @@ dmypy.json
# Pyre type checker
.pyre/
# All local-only / private files
**/local*
**/priv*

126
README.md

@ -1,2 +1,124 @@
# nicobot-jabber # nicobot
A chat bot that can ask something over XMPP / Jabber and wait for an answer
🤟 A collection of *cool* chat bots 🤟
It features :
- Participating in [Signal](https://www.signal.org/fr/) conversations
- Using [IBM Watson™ Language Translator](https://cloud.ibm.com/apidocs/language-translator) cloud API
## Requirements & installation
Requires :
- Python 3
- [signal-cli](https://github.com/AsamK/signal-cli) (for the *Signal* backend)
- An IBM Cloud account ([free account ok](https://www.ibm.com/cloud/free))
Install Python dependencies with :
    pip3 install -r requirements.txt
See below for Signal requirements.
## Transbot
*Transbot* is a chatbot interface to IBM Watson™ Language Translator service that translates messages.
Whenever it sees a keyword in a conversation, *transbot* will translate the whole message into a random language.
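The trigger logic described above can be sketched in a few lines. This is a simplified illustration rather than the bot's actual code: the helper names are hypothetical, and escaping the keywords with `re.escape` is an assumption added here so that translations containing characters such as `?` or `.` are matched literally.

```python
import random
import re


def build_pattern(keywords):
    # Joins all keywords into one alternation ; each keyword is escaped
    # (assumption made for this sketch) so it is matched literally
    return "|".join(re.escape(keyword) for keyword in keywords)


def is_triggered(pattern, message):
    # Case-insensitive search anywhere in the message
    return re.search(pattern, message, flags=re.IGNORECASE) is not None


def pick_languages(languages, tries=3):
    # Draws 'tries' candidate target languages at random (with replacement)
    return random.choices(languages, k=tries)


pattern = build_pattern(["hello", "Hallo?", "bonjour"])
print(is_triggered(pattern, "Well, HELLO there"))
print(is_triggered(pattern, "Good night"))
print(pick_languages(["fr", "de", "es", "it"]))
```

Several candidate languages are drawn because a translation may not be available for every language pair, so the bot can try the next candidate.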
### Quick start
1. [Create a *Language Translator* service instance on IBM Cloud](https://cloud.ibm.com/catalog/services/language-translator) and [get the URL and API key from your console](https://cloud.ibm.com/resources?groups=resource-instance)
2. Fill them into `test/sample-conf/config.yml` (`ibmcloud_url` and `ibmcloud_apikey`)
3. Run `python3 nicobot/transbot.py -C test/sample-conf`
4. Input `Hello world` in the console : the bot will print a random translation of "Hello World"
5. Input `Bye nicobot` : the bot will terminate
If you want to send & receive messages through *Signal* instead of reading from the keyboard & printing to the console :
1. Install and configure `signal-cli` (see below for details)
2. Run `python3 nicobot/transbot.py -C test/sample-conf -b signal -u '+33123456789' -r '+33987654321'`, where `-u +33123456789` is your *Signal* number and `-r +33987654321` is the number of the person you want the bot to chat with
See below for more options...
### Main configuration options and files
Run `transbot.py -h` to get a description of all options.
A sample configuration is available in the `test/sample-conf/` directory.
Below are the most important configuration options :
- **--config-file** and **--config-dir** let you change the default configuration directory and file. All configuration files will be looked up from this directory ; `--config-file` allows overriding the location of `config.yml`.
- **--keyword** and **--keywords-file** help you generate the list of keywords that trigger the bot. Run `transbot.py --keyword <a_keyword> --keyword <another_keyword> ...` **once** : this downloads all known translations for these keywords and saves them into a `keywords.json` file. The next time you run the bot, **don't** use the `--keyword` option : it will reuse the saved keywords list. You can use `--keywords-file` to change the default file name.
- **--language**, **--languages-file** : you should not need to use these options unless you only want to translate into a given set of languages. The first time the bot runs, it will download the list of supported languages into `languages.json` and reuse it afterwards (or the file indicated with `--languages-file`).
- **--ibmcloud-url** and **--ibmcloud-apikey** can be obtained from your IBM Cloud account ([create a Language Translator instance](https://cloud.ibm.com/apidocs/language-translator) then go to [the resource list](https://cloud.ibm.com/resources?groups=resource-instance))
- **--backend** selects the *chatter* system to use : it currently supports "console" and "signal" (see below)
- **--username** selects the account used to send and read messages ; its format depends on the backend
- **--recipient** and **--group** select the recipient (only one of them should be given) ; its format depends on the backend
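The caching behaviour described for `--keyword` and `--keywords-file` can be sketched as follows. This is a simplified illustration, not the bot's actual code: `fake_translate` is a hypothetical stand-in for the IBM Cloud call, and the language codes are made up for the demo.

```python
import json
import os
import tempfile


def load_keywords(keywords, file, translate, languages):
    """Returns the trigger list, reading it from 'file' when no keyword is given."""
    if not keywords:
        # No --keyword given : reuse the list saved by a previous run
        with open(file, "r") as f:
            return json.load(f)
    result = []
    for keyword in keywords:
        result.append(keyword)
        for lang in languages:
            result.append(translate(keyword, lang))
    # Save the generated list for the next runs
    with open(file, "w") as f:
        json.dump(result, f)
    return result


def fake_translate(keyword, lang):
    # Hypothetical stand-in for the remote translation call
    return "%s (%s)" % (keyword, lang)


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "keywords.json")
    # First run : keywords are given, translations are generated and saved
    first_run = load_keywords(["hello"], cache, fake_translate, ["fr", "de"])
    # Next runs : no keyword given, the saved list is reused
    second_run = load_keywords([], cache, fake_translate, ["fr", "de"])

print(first_run)
print(second_run == first_run)
```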
The **i18n.\<locale>.yml** file contains localization strings for your locale and fun :
- *Transbot* will say "Hello" when started and "Goodbye" before shutting down : you can configure those banners in this file.
- It also defines the message pattern that terminates the bot.
Finally, see the following chapter about the **config.yml** file.
### Config.yml configuration file
Options can also be taken from a configuration file : by default the bot reads the `config.yml` file in the current directory ; this can be changed with the `--config-file` and `--config-dir` options.
This file is in YAML format with all options at the root level. Keys have the same name as command line options, with middle dashes `-` replaced with underscores `_`.
E.g. `--ibmcloud-url https://api...` will become `ibmcloud_url: https://api...`.
A sample configuration is available in the `test/sample-conf/` directory.
Please first review [YAML syntax](https://yaml.org/spec/1.1/#id857168) if you don't know about YAML.
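As a rough sketch of how command line options and `config.yml` keys relate, the snippet below maps an option name to its YAML key and merges three layers of values (hard-coded defaults, then file values, then command line arguments). It relies on `argparse` leaving attributes that already exist on the namespace untouched unless the option is actually given. The function names and sample values are hypothetical, and the file values are passed as a plain dictionary instead of being read from YAML.

```python
import argparse


def option_to_key(option):
    # '--ibmcloud-url' becomes 'ibmcloud_url'
    return option.lstrip("-").replace("-", "_")


def merge_config(defaults, file_values, argv):
    parser = argparse.ArgumentParser()
    for key in defaults:
        parser.add_argument("--" + key.replace("_", "-"), dest=key)
    # 1. hard-coded default values
    config = argparse.Namespace(**defaults)
    # 2. configuration file overrides
    vars(config).update(file_values)
    # 3. command line overrides : attributes already present on the
    #    namespace are only replaced by options actually given in argv
    parser.parse_args(argv, namespace=config)
    return vars(config)


defaults = {"backend": "console", "username": None, "ibmcloud_url": None}
file_values = {"backend": "signal", "username": "+33123456789"}
merged = merge_config(defaults, file_values, ["--backend", "console"])
print(option_to_key("--ibmcloud-url"))
print(merged)
```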
## Using the Signal backend
By using `--backend signal` you can make the bot chat with Signal users.
### Prerequisite
You must first [install and configure *signal-cli*](https://github.com/AsamK/signal-cli#installation).
Then you must [*register* or *link*](https://github.com/AsamK/signal-cli/blob/master/man/signal-cli.1.adoc) the computer where the bot will run ; e.g. :
    signal-cli link --name MyComputer
### Parameters
With signal, make sure :
- the `--username` parameter is your phone number in international format (e.g. `+33123456789`). In `config.yml`, make sure to put quotes around it to prevent YAML thinking it's an integer (because of the 'plus' sign)
- specify either `--recipient` as an international phone number or `--group` with a base 64 group ID (e.g. `--group "mABCDNVoEFGz0YeZM1234Q=="`). Once registered with Signal, you can list the IDs of the groups you are in with `signal-cli -u +336123456789 listGroups`
Sample command line to run the bot with Signal :
    python3 nicobot/transbot.py -b signal -u +33612345678 -g "mABCDNVoEFGz0YeZM1234Q==" --ibmcloud-url https://api.eu-de.language-translator.watson.cloud.ibm.com/instances/a234567f-4321-abcd-efgh-1234abcd7890 --ibmcloud-apikey "f5sAznhrKQyvBFFaZbtF60m5tzLbqWhyALQawBg5TjRI"
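To illustrate how these parameters end up on the `signal-cli` command line, here is a small sketch that builds the argument list for sending a message, mirroring what `nicobot/signalcli.py` does in this commit (`-u`, `send`, `-m`, and `-g` for a group). The function name is hypothetical and nothing is executed ; the list would typically be passed to `subprocess.run`.

```python
def build_send_command(signal_cli, username, message, recipient=None, group=None):
    # Exactly one of recipient / group must be given
    if bool(recipient) == bool(group):
        raise ValueError("Exactly one of recipient or group must be given")
    cmd = [signal_cli, "-u", username, "send", "-m", message]
    if recipient:
        # Direct message : the recipient's number is a positional argument
        cmd.append(recipient)
    else:
        # Group message : the base64 group ID follows -g
        cmd += ["-g", group]
    return cmd


print(build_send_command("signal-cli", "+33123456789", "Hello", recipient="+33987654321"))
print(build_send_command("signal-cli", "+33123456789", "Hello", group="mABCDNVoEFGz0YeZM1234Q=="))
```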
## Resources
### Python libraries
- [xmpppy](https://github.com/xmpppy/xmpppy) : this library is very easy to use but it does not allow easy access to thread or timestamp
- [slixmpp](https://lab.louiz.org/poezio/slixmpp) : seems like a cool library too and claims to require minimal dependencies ; however the quick start example does not work OOTB...
- [aioxmpp](https://github.com/horazont/aioxmpp) : the official library, seems the most complete but lacks a practical introduction
None of them seems to support OMEMO out of the box :-(
### IBM Cloud
- [Language Translator service](https://cloud.ibm.com/catalog/services/language-translator)
- [Language Translator API documentation](https://cloud.ibm.com/apidocs/language-translator)
### Signal
- [Signal home](https://signal.org/)
- [signal-cli man page](https://github.com/AsamK/signal-cli/blob/master/man/signal-cli.1.adoc)

3
nicobot/__init__.py Normal file

@ -0,0 +1,3 @@
# -*- coding: utf-8 -*-
from bot import Bot
from chatter import Chatter

49
nicobot/bot.py Normal file

@ -0,0 +1,49 @@
# -*- coding: utf-8 -*-

import atexit
import signal
import sys


class Bot:
    """
    Bot foundation
    """

    def onMessage( self, message ):
        """
        Called by self.chatter whenever a message has arrived :
        if the given message contains any of the keywords in any language,
        will answer with a translation in a random language
        including the flag of the random language.

        message: A plain text message
        Returns the crafted translation
        """
        pass

    def onExit( self ):
        """
        Called just before exiting ; the chatter should still be available.
        Subclass MUST call registerExitHandler for this to work !
        """
        pass

    def onSignal( self, sig, frame ):
        # Thanks https://stackoverflow.com/questions/23468042/the-invocation-of-signal-handler-and-atexit-handler-in-python
        sys.exit(0)

    def registerExitHandler( self ):
        # Registers exit handlers to properly say goodbye
        atexit.register(self.onExit)
        # TODO This list does not work on Windows
        for sig in [signal.SIGINT, signal.SIGTERM, signal.SIGHUP ]:
            signal.signal(sig, self.onSignal)

    def run( self ):
        """
        Starts the bot
        """
        pass

31
nicobot/chatter.py Normal file

@ -0,0 +1,31 @@
# -*- coding: utf-8 -*-

class Chatter:
    """
    Bot engine interface
    """

    def start( self, bot ):
        """
        Waits for messages and calls the 'onMessage' method of the given Bot
        """
        pass

    def reply( self, source ):
        """
        Replies to a specific message or person
        """
        pass

    def send( self, message ):
        """
        Sends the given message using the underlying implemented chat protocol
        """
        pass

    def stop( self ):
        """
        Stops waiting for messages and exits the engine
        """
        pass

27
nicobot/console.py Normal file

@ -0,0 +1,27 @@
# -*- coding: utf-8 -*-

import logging
import sys


class ConsoleChatter:
    """
    Bot engine that reads from a stream and outputs to another
    """

    input = None
    output = None

    def __init__( self, input=sys.stdin, output=sys.stdout ):
        self.input = input
        self.output = output

    def start( self, bot ):
        for line in self.input:
            bot.onMessage( line )

    def send( self, message ):
        print( message, file=self.output, flush=True )

    def stop( self ):
        sys.exit(0)
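Because the console backend only needs an input stream and an output stream, it can be exercised without a keyboard. The sketch below is a hedged illustration : it re-declares a trimmed copy of `ConsoleChatter` so that it runs on its own, and `UpperBot` is a hypothetical bot invented for the demo.

```python
import io


class ConsoleChatter:
    # Trimmed copy of the class above, so the sketch is self-contained
    def __init__(self, input, output):
        self.input = input
        self.output = output

    def start(self, bot):
        for line in self.input:
            bot.onMessage(line)

    def send(self, message):
        print(message, file=self.output, flush=True)


class UpperBot:
    # Hypothetical bot : answers every message with its upper-cased text
    def __init__(self, chatter):
        self.chatter = chatter

    def onMessage(self, message):
        self.chatter.send(message.strip().upper())


out = io.StringIO()
chatter = ConsoleChatter(io.StringIO("hello\nworld\n"), out)
chatter.start(UpperBot(chatter))
print(out.getvalue())
```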

197
nicobot/signalcli.py Executable file

@ -0,0 +1,197 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import argparse
import logging
import sys
import os
import shutil
import subprocess
import atexit
import signal
import json
import i18n
import re
import locale
from chatter import Chatter
# Generic timeout for all signal-cli commands
TIMEOUT = 15
# Custom timeout to pass to signal-cli when receiving messages
RECEIVE_TIMEOUT = 5
class SignalChatter(Chatter):
"""
A signal bot relying on signal-cli
"""
def __init__( self, username, recipient=None, group=None, signal_cli=shutil.which("signal-cli") ):
if not username or not signal_cli:
raise ValueError("username and signal_cli must be provided")
if not recipient and not group:
raise ValueError("Either a recipient or a group must be given")
if recipient and group:
raise ValueError("Only one of recipient and group may be given")
self.username = username
self.recipient = recipient
self.group = group
self.signal_cli = signal_cli
# Properties set elsewhere
self.sentTimestamp = None
# If True, will terminate the main loop
self.shutdown = False
self.bot = None
def start( self, bot ):
self.bot = bot
while not self.shutdown:
self.filterMessages( self.receiveMessages() )
def send( self, message ):
cmd = [ self.signal_cli, "-u", self.username, "send", "-m", message ]
if self.recipient:
cmd = cmd + [ self.recipient ]
elif self.group:
cmd = cmd + [ "-g", self.group ]
# check=True raises a CalledProcessError if the exit status is not 0
logging.debug(cmd)
proc = subprocess.run( cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, check=True, timeout=TIMEOUT )
logging.debug( ">>> %s" % message )
sent = proc.stdout
logging.debug("Sent message : %s"%repr(sent))
self.sentTimestamp = int(sent)
def reply( self, source ):
# TODO
pass
def stop( self ):
self.shutdown = True
def receiveMessages( self, timeout=RECEIVE_TIMEOUT, input=None ):
cmd = [ self.signal_cli, "-u", self.username, "receive", "--json" ]
if timeout:
cmd = cmd + [ "-t", str(timeout) ]
if not input:
# TODO Log this at a finer (lower) level : it is very verbose and not useful when reading empty responses every few seconds
logging.debug(cmd)
proc = subprocess.Popen( cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE )
input = proc.stdout
events = []
for bline in iter(input.readline, b''):
logging.debug("Read line : %s" % bline)
try:
line = bline.decode()
except (UnicodeDecodeError, AttributeError):
line = bline
events = events + [json.loads(line.rstrip())]
return events
    def filterMessages( self, events ):
        for event in events:
            logging.debug("Filtering message : %s" % repr(event))
            envelope = event['envelope']
            if envelope['timestamp'] > self.sentTimestamp:
                if envelope['dataMessage']:
                    dataMessage = envelope['dataMessage']
                    if dataMessage['message']:
                        message = dataMessage['message']
                        if self.recipient:
                            if envelope['source'] == self.recipient:
                                self.bot.onMessage(message)
                                return True
                            else:
                                logging.debug("Discarding message not from recipient %s"%self.recipient)
                        elif self.group:
                            # Only accepts messages from the configured group
                            if dataMessage['groupInfo'] and dataMessage['groupInfo']['groupId'] == self.group:
                                self.bot.onMessage(message)
                                return True
                            else:
                                logging.debug("Discarding message not from group %s" % self.group)
                    else:
                        logging.debug("Discarding message without text")
                else:
                    logging.debug("Discarding message without data")
            else:
                logging.debug("Discarding message that was sent before ours")
        return False
if __name__ == '__main__':
""" FIXME This entry point is not working anymore ! """
parser = argparse.ArgumentParser( description='Sends a Signal message and reads the answer' )
# Core parameters
parser.add_argument('--username', '-u', dest='username', required=True, help="Sender's number (e.g. +12345678901)")
parser.add_argument('--group', '-g', dest='group', help="Group's ID in base64 (e.g. mPC9JNVoKDGz0YeZMsbL1Q==)")
parser.add_argument('--recipient', '-r', dest='recipient', help="Recipient's number (e.g. +12345678901)")
parser.add_argument('--signal-cli', '-s', dest='signal_cli', default=shutil.which("signal-cli"), help="Path to `signal-cli` if not in PATH")
# Misc. options
parser.add_argument("--i18n-dir", "-I", dest="i18n_dir", default=os.path.dirname(os.path.realpath(__file__)), help="Directory where to find translation files. Defaults to this script's directory.")
parser.add_argument('--verbosity', '-V', dest='log_level', default="INFO", help="Log level")
parser.add_argument("--test", '-T', dest="test", action="store_true", default=False, help="Activate test mode")
parser.add_argument('--locale', '-L', dest='locale', default=None, help="Change default locale (e.g. 'fr')")
args = parser.parse_args()
if not args.signal_cli:
raise ValueError("Could not find the 'signal-cli' command in PATH and no --signal-cli given")
if not args.recipient and not args.group:
raise ValueError("Either --recipient or --group must be provided")
# Logging configuration
# TODO Allow for a trace level (high-volume debug)
# TODO How to tag logs from this module so that their level can be tuned specifically ?
logLevel = getattr(logging, args.log_level.upper(), None)
if not isinstance(logLevel, int):
raise ValueError('Invalid log level: %s' % args.log_level)
# Logs are output to stderr ; stdout is reserved to print the answer(s)
logging.basicConfig(level=logLevel, stream=sys.stderr)
logging.debug("Current locale : %s"%repr(locale.getlocale()))
if args.locale:
loc = args.locale
else:
loc = locale.getlocale()[0]
# See https://pypi.org/project/python-i18n/
logging.debug("i18n_dir : %s"%args.i18n_dir)
# FIXME Manually set the locale : how come a Python library named 'i18n' doesn't take into account the Python locale by default ?
i18n.set('locale',loc.split('_')[0])
logging.debug("i18n locale : %s"%i18n.get('locale'))
i18n.set('filename_format', 'i18n.{locale}.{format}') # Removing the namespace is simpler for us
i18n.load_path.append(args.i18n_dir)
# This MUST be instantiated AFTER i18n has been configured !
RE_SHUTDOWN = re.compile( i18n.t('Shutdown'), re.IGNORECASE )
""" Real start """
bot = SignalChatter( username=args.username, signal_cli=args.signal_cli, recipient=args.recipient, group=args.group )
if args.test:
bot.run(sys.stdin)
else:
bot.run()
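The filtering performed by `filterMessages` can also be written defensively with `dict.get`, so that events lacking some keys are skipped instead of raising a `KeyError`. This is only a sketch : the event layout below is inferred from the keys accessed in the code above (`envelope`, `timestamp`, `source`, `dataMessage`, `message`, `groupInfo`, `groupId`) and is an assumption, not a documented `signal-cli` schema. `extract_text` is a hypothetical helper.

```python
def extract_text(event, recipient=None, group=None, since=0):
    """Returns the message text if the event passes all filters, else None."""
    envelope = event.get("envelope") or {}
    # Ignores anything not strictly newer than our last sent message
    if envelope.get("timestamp", 0) <= since:
        return None
    data = envelope.get("dataMessage") or {}
    text = data.get("message")
    if not text:
        return None
    if recipient:
        return text if envelope.get("source") == recipient else None
    group_info = data.get("groupInfo") or {}
    if group and group_info.get("groupId") == group:
        return text
    return None


# Sample event with made-up values, following the assumed layout
event = {
    "envelope": {
        "source": "+33987654321",
        "timestamp": 200,
        "dataMessage": {"message": "Hello world", "groupInfo": None},
    }
}
print(extract_text(event, recipient="+33987654321", since=100))
print(extract_text(event, recipient="+33000000000", since=100))
```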

479
nicobot/transbot.py Executable file

@ -0,0 +1,479 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Sample bot that translates text whenever it sees a message with one of its keywords.
"""
import argparse
import logging
import sys
import os
import shutil
import json
import i18n
import re
import locale
import requests
import random
# Provides an easy way to get the unicode sequence for country flags
import flag
import yaml
# Own classes
from bot import Bot
from console import ConsoleChatter
from signalcli import SignalChatter
# Default timeout for requests in seconds
# Note : More than 10s recommended (30s ?) on IBM Cloud with a free account
TIMEOUT = 60
# Set to None to translate keywords in all available languages
# Set to something > 0 to limit the number of translations for the keywords (for tests)
LIMIT_KEYWORDS = None
# Default (empty actually) configuration, to ease depth navigation
class Config:
def __init__(self):
self.__dict__.update({
'backend': "console",
'config_file': None,
'config_dir': os.getcwd(),
'group': None,
'ibmcloud_url': None,
'ibmcloud_apikey': None,
'input_file': sys.stdin,
'keywords': [],
'keywords_file': None,
'languages': [],
'languages_file': None,
'locale': None,
'recipient': None,
'shutdown': None,
'signal_cli': shutil.which("signal-cli"),
'username': None,
'verbosity': "INFO"
})
"""
TODO Find a better way to log requests.Response objects
"""
def _logResponse( r ):
logging.debug("<<< Response : %s\tbody: %.60s[...]", repr(r), r.content )
class TransBot(Bot):
"""
Sample bot that translates text.
It only answers to messages containing defined keywords.
It uses IBM Watson Language Translator (see API docs : https://cloud.ibm.com/apidocs/language-translator) to translate the text.
"""
def __init__( self, chatter, ibmcloud_url, ibmcloud_apikey, keywords=None, keywords_file=None, languages=None, languages_file=None, shutdown_pattern=r'bye nicobot' ):
"""
keywords: list of keywords that will trigger this bot (in any supported language)
keywords_file: JSON file where to find the list of keywords (or write into)
languages: List of supported languages in this format : https://cloud.ibm.com/apidocs/language-translator#list-identifiable-languages
languages_file: JSON file where to find the list of target languages (or write into)
shutdown_pattern: a regular expression pattern that terminates this bot
chatter: the backend chat engine
ibmcloud_url (required): IBM Cloud API base URL (e.g. 'https://api.eu-de.language-translator.watson.cloud.ibm.com/instances/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx')
ibmcloud_apikey (required): IBM Cloud API key (e.g. 'dG90byBlc3QgZGFucyBsYSBwbGFjZQo')
store_path: Base directory where to cache files
"""
self.ibmcloud_url = ibmcloud_url
self.ibmcloud_apikey = ibmcloud_apikey
self.chatter = chatter
# After IBM credentials have been set we can retrieve the list of supported languages
if languages:
self.languages = languages
else:
self.languages = self.loadLanguages(file=languages_file)
# How many different languages to try to translate to
self.tries = 3
# After self.languages has been set, we can iterate over to translate keywords
kws = self.loadKeywords( keywords=keywords, file=keywords_file, limit=LIMIT_KEYWORDS )
pattern = kws[0]
for keyword in kws[1:]:
pattern = pattern + r'|' + keyword
# Built regular expression pattern that triggers an answer from this bot
self.re_keywords = pattern
# Regular expression pattern of messages that stop the bot
self.re_shutdown = shutdown_pattern
def loadLanguages( self, force=False, file=None ):
"""
Loads the list of known languages.
Requires the IBM Cloud credentials to be set before !
If force==True then calls the remote service, otherwise reads from the given file if given
"""
# TODO It starts with the same code as in loadKeywords : make it a function
# Gets the list from a local file
if not force and file:
logging.debug("Reading from %s..." % file)
try:
with open(file,'r') as f:
j = json.load(f)
return j['languages']
except:
logging.info("Could not read languages list from %s" % file)
pass
# Else, gets the list from the cloud
# curl --user apikey:{apikey} "{url}/v3/identifiable_languages?version=2018-05-01"
url = "%s/v3/identifiable_languages?version=2018-05-01" % self.ibmcloud_url
headers = {
'Accept': 'application/json',
'X-Watson-Learning-Opt-Out': 'true'
}
logging.debug(">>> GET %s, %s",url,repr(headers))
r = requests.get(url, headers=headers, auth=('apikey',self.ibmcloud_apikey), timeout=TIMEOUT)
_logResponse(r)
if r.status_code == requests.codes.ok:
# Save it for the next time
if file:
try:
logging.debug("Saving languages to %s..." % file)
with open(file,'w') as f:
f.write(r.text)
except:
logging.exception("Could not save the languages list to %s" % file)
pass
else:
logging.debug("Not saving languages as no file was given")
return r.json()['languages']
else:
r.raise_for_status()
def loadKeywords( self, keywords=[], file=None, limit=None ):
"""
Generates a list of translations from a list of keywords.
Requires self.languages to be filled before !
If 'keywords' is not empty, will download the translations from IBM Cloud into 'file'.
Otherwise, will try to read from 'file', falling back to IBM Cloud and saving it into 'file' if it fails.
"""
# TODO It starts with the same code as in loadLanguages : make it a function
# Gets the list from a local file
if not keywords or len(keywords) == 0:
logging.debug("Reading from %s..." % file)
try:
with open(file,'r') as f:
j = json.load(f)
logging.debug("Read keyword list : %s",repr(j))
return j
except:
raise ValueError("Could not read keywords list from %s and no keyword given" % file)
pass
kws = []
for keyword in keywords:
logging.debug("Init %s...",keyword)
kws = kws + [ keyword ]
for lang in self.languages:
# For tests, in order not to use all credits, we can limit the number of calls here
if limit and len(kws) >= limit:
break
try:
translation = self.translate( keyword, target=lang['language'] )
translated = translation['translation'].rstrip()
logging.debug("Adding translation %s in %s for %s", translated, lang, keyword)
kws = kws + [ translated ]
except:
logging.exception("Could not translate %s into %s", keyword, repr(lang))
pass
logging.debug("Keywords : %s", repr(kws))
if file:
try:
logging.debug("Saving keywords translations into %s...", file)
with open(file,'w') as f:
json.dump(kws,f)
except:
logging.exception("Could not save keywords translations into %s", file)
pass
else:
logging.debug("Not saving keywords as no file was given")
return kws
def translate( self, message, target, source=None ):
"""
Translates a given message.
target: Target language short code (e.g. 'en')
source: Source language short code ; if not given will try to guess
Returns the plain translated message or None if no translation could be found.
"""
# curl -X POST -u "apikey:{apikey}" --header "Content-Type: application/json" --data "{\"text\": [\"Hello, world! \", \"How are you?\"], \"model_id\":\"en-es\"}" "{url}/v3/translate?version=2018-05-01"
url = "%s/v3/translate?version=2018-05-01" % self.ibmcloud_url
body = {
"text": [message],
"target": target
}
if source:
body['source'] = source
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'X-Watson-Learning-Opt-Out': 'true'
}
logging.debug(">>> POST %s, %s, %s",url,repr(body),repr(headers))
r = requests.post(url, json=body, headers=headers, auth=('apikey',self.ibmcloud_apikey), timeout=TIMEOUT)
# TODO Log full response when it's useful (i.e. when a message is going to be answered)
_logResponse(r)
if r.status_code == requests.codes.ok:
j = r.json()
translation = j['translations']
return translation[0]
# A 404 can happen if there is no translation available
elif r.status_code == requests.codes.not_found:
return None
else:
r.raise_for_status()
def onMessage( self, message ):
"""
Called by self.chatter whenever a message has arrived :
if the given message contains any of the keywords in any language,
will answer with a translation in a random language
including the flag of the random language.
message: A plain text message
Returns the crafted translation
"""
# FIXME re.compile((i18n.t('Shutdown'),re.IGNORECASE).search(message) does not work
# as expected so we use re.search(...)
if re.search( self.re_shutdown, message, re.IGNORECASE ):
logging.debug("Shutdown asked")
self.chatter.stop()
# Only if the message contains a keyword
elif re.search( self.re_keywords, message, flags=re.IGNORECASE ):
# Selects a few random target languages each time
langs = random.choices( self.languages, k=self.tries )
for lang in langs:
# Gets a translation in this random language
translation = self.translate( message, target=lang['language'] )
if translation:
translated = translation['translation'].rstrip()
try:
lang_emoji = flag.flag(lang['language'])
except ValueError:
lang_emoji= "🏳️‍🌈"
answer = "%s %s" % (translated,lang_emoji)
logging.debug(">> %s" % answer)
self.chatter.send(answer)
# Returns as soon as one translation was done
return
else:
pass
logging.warning("Could not find a translation in %s for %s",repr(langs),message)
else:
logging.debug("Message did not have a keyword")
def onExit( self ):
sent = self.chatter.send( i18n.t('Goodbye') )
def run( self ):
"""
Starts the bot :
1. Sends a hello message
2. Waits for messages to translate
"""
self.chatter.send( i18n.t('Hello') )
self.registerExitHandler()
self.chatter.start(self)
if __name__ == '__main__':
"""
A convenient CLI to play with this bot
"""
#
# Two-pass arguments parsing
#
config = Config()
parser = argparse.ArgumentParser( description="A bot that reacts to messages with given keywords by responding with a random translation" )
# Bootstrap options
parser.add_argument("--config-file", "-c", dest="config_file", help="YAML configuration file.")
parser.add_argument("--config-dir", "-C", dest="config_dir", default=config.config_dir, help="Directory where to find configuration, cache and translation files by default.")
parser.add_argument('--verbosity', '-V', dest='verbosity', default=config.verbosity, help="Log level")
# Core arguments
parser.add_argument("--keyword", "-k", dest="keywords", action="append", help="Keyword bot should react to (will write them into the file specified with --keywords-file)")
parser.add_argument("--keywords-file", dest="keywords_file", help="File to load from and write keywords to")
parser.add_argument("--language", "-l", dest="languages", action="append", help="Target language")
parser.add_argument("--languages-file", dest="languages_file", help="File to load from and write languages to")
parser.add_argument("--shutdown", dest="shutdown", help="Shutdown keyword regular expression pattern")
parser.add_argument("--ibmcloud-url", dest="ibmcloud_url", help="IBM Cloud API base URL (get it from your resource https://cloud.ibm.com/resources)")
parser.add_argument("--ibmcloud-apikey", dest="ibmcloud_apikey", help="IBM Cloud API key (get it from your resource : https://cloud.ibm.com/resources)")
# Chatter-generic arguments
parser.add_argument("--backend", "-b", dest="backend", choices=["signal","console"], default=config.backend, help="Chat backend to use")
parser.add_argument("--input-file", "-i", dest="input_file", default=config.input_file, help="File to read messages from (one per line)")
parser.add_argument('--username', '-u', dest='username', help="Sender's number (e.g. +12345678901 for the 'signal' backend)")
parser.add_argument('--group', '-g', dest='group', help="Group's ID in base64 (e.g. 'mPC9JNVoKDGz0YeZMsbL1Q==' for the 'signal' backend)")
parser.add_argument('--recipient', '-r', dest='recipient', help="Recipient's number (e.g. +12345678901)")
# Signal-specific arguments
parser.add_argument('--signal-cli', dest='signal_cli', default=config.signal_cli, help="Path to `signal-cli` if not in PATH")
# Misc. options
parser.add_argument('--locale', '-L', dest='locale', default=config.locale, help="Change default locale (e.g. 'fr')")
#
# 1st pass only matters for 'bootstrap' options : configuration file and logging
#
parser.parse_args(namespace=config)
# Logging configuration
logLevel = getattr(logging, config.verbosity.upper(), None)
if not isinstance(logLevel, int):
raise ValueError('Invalid log level: %s' % config.verbosity)
# Logs are output to stderr ; stdout is reserved to print the answer(s)
logging.basicConfig(level=logLevel, stream=sys.stderr)
logging.debug( "Configuration for bootstrap : %s", repr(vars(config)) )
# Loads the config file that will be used to lookup some missing parameters
if not config.config_file:
config.config_file = os.path.join(config.config_dir,"config.yml")
try:
with open(config.config_file,'r') as file:
# full_load() uses the FullLoader to convert the YAML document
# into a Python dictionary
dictConfig = yaml.full_load(file)
logging.debug("Successfully loaded configuration from %s : %s" % (config.config_file,repr(dictConfig)))
config.__dict__.update(dictConfig)
except:
pass
# From here the config object has the default values, overridden by the configuration file if one was loaded
#logging.debug( "Configuration after bootstrap : %s", repr(vars(config)) )
#
# 2nd pass parses all options
#
# Updates the existing config object with all parsed options
parser.parse_args(namespace=config)
logging.debug( "Final configuration : %s", repr(vars(config)) )
#
# From here the config object has default options from:
# 1. hard-coded default values
# 2. configuration file overrides
# 3. command line overrides
#
# We can check the required options that could not be checked before
# (because required arguments may have been set from the config file and not on the command line)
#
# i18n + l10n
logging.debug("Current locale : %s"%repr(locale.getlocale()))
if not config.locale:
config.locale = locale.getlocale()[0]
# See https://pypi.org/project/python-i18n/
# FIXME Manually sets the locale : how come a Python library named 'i18n' doesn't take into account the Python locale by default ?
i18n.set('locale',config.locale.split('_')[0])
logging.debug("i18n locale : %s"%i18n.get('locale'))
i18n.set('filename_format', 'i18n.{locale}.{format}') # Removing the namespace from keys is simpler for us
i18n.load_path.append(config.config_dir)
if not config.ibmcloud_url:
raise ValueError("Missing required parameter : --ibmcloud-url")
if not config.ibmcloud_apikey:
raise ValueError("Missing required parameter : --ibmcloud-apikey")
# config.keywords is used if given
# else, check for an existing keywords_file
if not config.keywords_file:
# As a last resort, use 'keywords.json' in the config directory
config.keywords_file = os.path.join(config.config_dir,'keywords.json')
# Convenience check to better warn the user
if not config.keywords:
try:
with open(config.keywords_file,'r') as f:
pass
except:
raise ValueError("Could not open %s : please generate with --keyword first or create the file indicated with --keywords-file"%config.keywords_file)
# config.languages is used if given
# else, check for an existing languages_file
if not config.languages_file:
# As a last resort, use 'languages.json' in the config directory
config.languages_file = os.path.join(config.config_dir,'languages.json')
# Convenience check to better warn the user
if not config.languages:
try:
with open(config.languages_file,'r') as f:
pass
except:
raise ValueError("Could not open %s : please remove --language to generate it automatically or create the file indicated with --languages-file"%config.languages_file)
if not config.shutdown:
# This MUST be instantiated AFTER i18n has been configured !
config.shutdown = i18n.t('Shutdown')
# Creates the chat engine depending on the 'backend' parameter
if config.backend == "signal":
if not config.signal_cli:
raise ValueError("Could not find the 'signal-cli' command in PATH and no --signal-cli given")
if not config.username:
raise ValueError("Missing a username")
if not config.recipient and not config.group:
raise ValueError("Either --recipient or --group must be provided")
chatter = SignalChatter(
username=config.username,
recipient=config.recipient,
group=config.group,
signal_cli=config.signal_cli)
# By default (or if backend == "console"), will read from stdin or a given file and output to console
else:
chatter = ConsoleChatter(config.input_file,sys.stdout)
#
# Real start
#
TransBot(
keywords=config.keywords, keywords_file=config.keywords_file,
languages=config.languages, languages_file=config.languages_file,
ibmcloud_url=config.ibmcloud_url, ibmcloud_apikey=config.ibmcloud_apikey,
shutdown_pattern=config.shutdown,
chatter=chatter
).run()
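
Review note: the keywords and languages checks above share one pattern: fall back to a default file in the configuration directory, then fail early if that file cannot be opened. Below is a minimal sketch of that pattern as a single helper. `resolve_data_file` and its parameter names are hypothetical and do not exist in this commit.

```python
import os


def resolve_data_file(explicit_path, config_dir, default_name, inline_values=None):
    """Return the path of a data file, defaulting to config_dir/default_name.

    Hypothetical helper, not part of this commit. The readability check is
    skipped when inline_values is given, mirroring the checks on
    config.keywords and config.languages.
    """
    path = explicit_path or os.path.join(config_dir, default_name)
    if not inline_values:
        try:
            # Open and close immediately: only readability is being tested
            with open(path, "r"):
                pass
        except OSError as e:
            raise ValueError("Could not open %s" % path) from e
    return path
```

With such a helper, each of the two checks would reduce to one call, for example `resolve_data_file(config.keywords_file, config.config_dir, 'keywords.json', config.keywords)`.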

requirements.txt Normal file

@ -0,0 +1,8 @@
# Requirements for nicobot.py
# https://requests.readthedocs.io/en/master/
requests
# https://github.com/cvzi/flag
emoji-country-flag
python-i18n
# https://pyyaml.org/wiki/PyYAMLDocumentation
pyyaml

@ -0,0 +1,14 @@
# IBM Cloud credentials for your 'Language Translator' service instance : get them from your IBM Cloud console
# See detailed instructions : https://cloud.ibm.com/apidocs/language-translator
ibmcloud_url: https://api.us-south.language-translator.watson.cloud.ibm.com/instances/6bbda3b3-d572-45e1-8c54-22d6ed9e52c2
ibmcloud_apikey: "f5sAznhrKQyvBFFaZbtF60m5tzLbqWhyALQawBg5TjRI"
backend: console
#backend: signal
# Signal credentials
# Make sure to put quotes around the username field as it is a phone number for Signal
username: "+33123456789"
recipient: "+33123456789"
# Get this group ID with the command `signal-cli -u +33123456789 listGroups`
#group: "mABCDNVoEFGz0YeZM1234Q=="

@ -0,0 +1,4 @@
en:
  Hello: 🤟 nicobot ready 🤟
  Goodbye: See you later 👋
  Shutdown: bye nicobot

@ -0,0 +1,4 @@
fr:
  Hello: "🤟 nicobot paré 🤟"
  Goodbye: A+ 👋
  Shutdown: couché nicobot

@ -0,0 +1 @@
["bonjour", "\u0645\u0631\u062d\u0628\u0627", "\u0410\u043b\u043e?", "\u09b9\u09cd\u09af\u09be\u09b2\u09cb", "Ahoj.", "Hallo?", "Guten Tag", "\u0395\u03bc\u03c0\u03c1\u03cc\u03c2!", "Hello", "hola", "Tere.", "Hei", "Hello", "\u0ab9\u0ac7\u0ab2\u0acb\u0ab5", "\u05d4\u05dc\u05d5", "\u0928\u092e\u0938\u094d\u0915\u093e\u0930", "Zdravo.", "Hello", "Ciao", "\u30cf\u30ed\u30fc", "\uc548\ub155\ud558\uc138\uc694", "Labas.", "Sveika.", "\u0d39\u0d32\u0d4b", "Hello.", "Hello", "Hallo", "\u0939\u0947\u0932\u094b", "Hallo", "Witaj", "Ol\u00e1", "Salut", "\u0417\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435.", "\u0dc4\u0dd9\u0dbd\u0ddd", "Ahoj.", "Zdravo.", "Hej.", "\u0bb9\u0bb2\u0bcb", "\u0c39\u0c32\u0c4b", "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35", "Merhaba.", "\u0633\u0644\u0627\u0645", "Xin ch\u00e0o", "\u4f60\u597d", "\u4f60\u597d", "coucou", "\u0645\u0633\u062d\u0648\u0631", "\"", "\u0995\u09c1\u0981\u099a\u09be\u09a8\u09cb", "scelov\u00e1no", "couched", "Couched", "- ...", "couched", "conectado", "kubitud", "Coutettava", "Couch\u00e9", "@ info", "\u0a9c\u0acb\u0aa1\u0ac7\u0ab2", "\u05de\u05e6\u05e8\u05e3", "\u0926\u093f\u092f\u093e \u0939\u0941\u0906", "spojeno", "Kusz\u00e1lt", "accoppiata", "\u30af\u30c1\u30c9", "\ucfe8\ud558\uac8c", "kuita", "1.", "\u0d15\u0d1a\u0d4d\u0d1a\u0d35\u0d1f\u0d02", "Kuy", "kuffar", "hovet", "\u091c\u094b\u0930\u0926\u093e\u0930", "Couches", "przewr\u00f3cony", "cucut\u0103", "\u041a\u0430\u0448\u0435\u043b\u044c", "\u0d9a\u0daf\u0dc0\u0dd4\u0dbb", "pre\u0165ahovan\u00e9", "couched", "couched", "\u0b95\u0bc2\u0bae\u0bcd\u0baa\u0bc1", "\u0c15\u0c42\u0c30\u0c4d\u0c2a\u0c41", "\u0e16\u0e39\u0e01\u0e1e\u0e31\u0e01\u0e44\u0e27\u0e49", "kanepeden", "\u06a9\u0648\u062a", "C\u00f3", "\u5e93\u7126", "\u9999", "salut", "hello"]
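
Review note: the file above is a flat JSON array of trigger words. Here is a minimal sketch of how such a list could be turned into a single matcher. The whole-word, case-insensitive matching and the name `build_keyword_pattern` are assumptions for illustration; this commit may detect keywords differently.

```python
import json
import re


def build_keyword_pattern(keywords):
    """Compile one case-insensitive regex matching any keyword as a whole word.

    Hypothetical helper: the matching rules here are assumptions.
    """
    # Drop duplicates and blank entries; longest first so longer keywords win
    unique = sorted({k for k in keywords if k.strip()}, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in unique)
    # Lookarounds instead of \b so keywords ending in punctuation still match
    return re.compile(r"(?<!\w)(?:%s)(?!\w)" % alternatives, re.IGNORECASE)


# A short excerpt in the same format as the keywords file
keywords = json.loads('["bonjour", "Hello", "Hallo?", "\\u4f60\\u597d", "hello"]')
pattern = build_keyword_pattern(keywords)
```

`pattern.search(message)` then returns a match object when the message contains a keyword and `None` otherwise. Entries such as `"\""` or `"1."` in the generated list would also become triggers under this scheme, so filtering them beforehand may be worthwhile.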

@ -0,0 +1,228 @@
{
"languages" : [ {
"language" : "af",
"name" : "Afrikaans"
}, {
"language" : "ar",
"name" : "Arabic"
}, {
"language" : "az",
"name" : "Azerbaijani"
}, {
"language" : "ba",
"name" : "Bashkir"
}, {
"language" : "be",
"name" : "Belarusian"
}, {
"language" : "bg",
"name" : "Bulgarian"
}, {
"language" : "bn",
"name" : "Bengali"
}, {
"language" : "ca",
"name" : "Catalan"
}, {
"language" : "cs",
"name" : "Czech"
}, {
"language" : "cv",
"name" : "Chuvash"
}, {
"language" : "da",
"name" : "Danish"
}, {
"language" : "de",
"name" : "German"
}, {
"language" : "el",
"name" : "Greek"
}, {
"language" : "en",
"name" : "English"
}, {
"language" : "eo",
"name" : "Esperanto"
}, {
"language" : "es",
"name" : "Spanish"
}, {
"language" : "et",
"name" : "Estonian"
}, {
"language" : "eu",
"name" : "Basque"
}, {
"language" : "fa",
"name" : "Persian"
}, {
"language" : "fi",
"name" : "Finnish"
}, {
"language" : "fr",
"name" : "French"
}, {
"language" : "ga",
"name" : "Irish"
}, {
"language" : "gu",
"name" : "Gujarati"
}, {
"language" : "he",
"name" : "Hebrew"
}, {
"language" : "hi",
"name" : "Hindi"
}, {
"language" : "hr",
"name" : "Croatian"
}, {
"language" : "ht",
"name" : "Haitian"
}, {
"language" : "hu",
"name" : "Hungarian"
}, {
"language" : "hy",
"name" : "Armenian"
}, {
"language" : "is",
"name" : "Icelandic"
}, {
"language" : "it",
"name" : "Italian"
}, {
"language" : "ja",
"name" : "Japanese"
}, {
"language" : "ka",
"name" : "Georgian"
}, {
"language" : "kk",
"name" : "Kazakh"
}, {
"language" : "km",
"name" : "Central Khmer"
}, {
"language" : "ko",
"name" : "Korean"
}, {
"language" : "ku",
"name" : "Kurdish"
}, {
"language" : "ky",
"name" : "Kirghiz"
}, {
"language" : "lo",
"name" : "Lao"
}, {
"language" : "lt",
"name" : "Lithuanian"
}, {
"language" : "lv",
"name" : "Latvian"
}, {
"language" : "ml",
"name" : "Malayalam"
}, {
"language" : "mn",
"name" : "Mongolian"
}, {
"language" : "mr",
"name" : "Marathi"
}, {
"language" : "ms",
"name" : "Malay"
}, {
"language" : "mt",
"name" : "Maltese"
}, {
"language" : "my",
"name" : "Burmese"
}, {
"language" : "nb",
"name" : "Norwegian Bokmal"
}, {
"language" : "ne",
"name" : "Nepali"
}, {
"language" : "nl",
"name" : "Dutch"
}, {
"language" : "nn",
"name" : "Norwegian Nynorsk"
}, {
"language" : "pa",
"name" : "Panjabi"
}, {
"language" : "pa-PK",
"name" : "Panjabi (Shahmukhi script, Pakistan)"
}, {
"language" : "pl",
"name" : "Polish"
}, {
"language" : "ps",
"name" : "Pushto"
}, {
"language" : "pt",
"name" : "Portuguese"
}, {
"language" : "ro",
"name" : "Romanian"
}, {
"language" : "ru",
"name" : "Russian"
}, {
"language" : "si",
"name" : "Sinhala"
}, {
"language" : "sk",
"name" : "Slovakian"
}, {
"language" : "sl",
"name" : "Slovenian"
}, {
"language" : "so",
"name" : "Somali"
}, {
"language" : "sq",
"name" : "Albanian"
}, {
"language" : "sr",
"name" : "Serbian"
}, {
"language" : "sv",
"name" : "Swedish"
}, {
"language" : "ta",
"name" : "Tamil"
}, {
"language" : "te",
"name" : "Telugu"
}, {
"language" : "th",
"name" : "Thai"
}, {
"language" : "tl",
"name" : "Tagalog"
}, {
"language" : "tr",
"name" : "Turkish"
}, {
"language" : "uk",
"name" : "Ukrainian"
}, {
"language" : "ur",
"name" : "Urdu"
}, {
"language" : "vi",
"name" : "Vietnamese"
}, {
"language" : "zh",
"name" : "Simplified Chinese"
}, {
"language" : "zh-TW",
"name" : "Traditional Chinese"
} ]
}
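
Review note: the languages file above wraps a list of `{"language", "name"}` objects in a top-level `languages` key. Here is a minimal sketch of reading that structure and drawing a random target language from it. `pick_target_language`, its `exclude` parameter and the two-entry sample are hypothetical and only illustrate the file layout.

```python
import json
import random

# Two entries copied from the file above, in the same layout
sample = '{"languages": [{"language": "af", "name": "Afrikaans"}, {"language": "zh-TW", "name": "Traditional Chinese"}]}'


def pick_target_language(languages_json, exclude=(), rng=random):
    """Return a random language code from a languages.json document.

    Hypothetical helper: codes listed in exclude (for instance the source
    language) are never returned.
    """
    codes = [entry["language"] for entry in json.loads(languages_json)["languages"]]
    candidates = [code for code in codes if code not in exclude]
    return rng.choice(candidates)
```

For example, `pick_target_language(sample, exclude=("af",))` can only return `"zh-TW"`, since that is the single remaining candidate.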