diff --git a/.gitignore b/.gitignore index b6e4761..7f9f09a 100644 --- a/.gitignore +++ b/.gitignore @@ -127,3 +127,7 @@ dmypy.json # Pyre type checker .pyre/ + +# All local-only / private files +**/local* +**/priv* diff --git a/README.md b/README.md index 51e35cf..c64615a 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,124 @@ -# nicobot-jabber -A chat bot that can ask something over XMPP / Jabber and wait for an answer +# nicobot + +🤟 A collection of *cool* chat bots 🤟 + +It features : + +- Participating in [Signal](https://www.signal.org/fr/) conversations +- Using [IBM Watson™ Language Translator](https://cloud.ibm.com/apidocs/language-translator) cloud API + + +## Requirements & installation + +Requires : + +- Python 3 +- [signal-cli](https://github.com/AsamK/signal-cli) (for the *Signal* backend) +- An IBM Cloud account ([free account ok](https://www.ibm.com/cloud/free)) + +Install Python dependencies with : + + pip3 install -r requirements.txt + +See below for Signal requirements. + + +## Transbot + +*Transbot* is a chatbot interface to IBM Watson™ Language Translator service that translates messages. +Whenever it sees a keyword in a conversation, *transbot* will translate the whole message into a random language. + +### Quick start + +1. [Create a *Language Translator* service instance on IBM Cloud](https://cloud.ibm.com/catalog/services/language-translator) and [get the URL and API key from your console](https://cloud.ibm.com/resources?groups=resource-instance) +2. Fill them into `test/sample-conf/config.yml` (`ibmcloud_url` and `ibmcloud_apikey`) +3. Run `python3 nicobot/transbot.py -C test/sample-conf` +4. Input `Hello world` in the console : the bot will print a random translation of "Hello World" +5. Input `Bye nicobot` : the bot will terminate + +If you want to send & receive messages through *Signal* instead of reading from the keyboard & printing to the console : + +1. Install and configure `signal-cli` (see below for details) +2. 
Run `python3 nicobot/transbot.py -C test/sample-conf -b signal -u '+33123456789' -r '+34987654321'` where `-u +33123456789` is your *Signal* number and `-r +34987654321` is the number of the person you want the bot to chat with + +See below for more options... + + +### Main configuration options and files + +Run `transbot.py -h` to get a description of all options. +A sample configuration is available in the `test/sample-conf/` directory. + +Below are the most important configuration options : + +- **--config-file** and **--config-dir** let you change the default configuration directory and file. All configuration files will be looked up from this directory ; `--config-file` allows overriding the location of `config.yml`. +- **--keyword** and **--keywords-file** will help you generate the list of keywords that will trigger the bot. To do this, run `transbot.py --keyword --keyword ...` a **first time**, giving one keyword after each `--keyword` option : this will download all known translations for these keywords and save them into a `keywords.json` file. Next time you run the bot, **don't** use the `--keyword` option : it will reuse this saved keywords list. You can use `--keywords-file` to change the default name. +- **--language**, **--languages-file** : you should not need to use these options unless you only want to translate into a given set of languages. The first time the bot runs, it will download the list of supported languages into `languages.json` (or the file indicated with `--languages-file`) and reuse it afterwards. 
+- **--ibmcloud-url** and **--ibmcloud-apikey** can be obtained from your IBM Cloud account ([create a Language Translator instance](https://cloud.ibm.com/apidocs/language-translator) then go to [the resource list](https://cloud.ibm.com/resources?groups=resource-instance)) +- **--backend** selects the *chatter* system to use : it currently supports "console" and "signal" (see below) +- **--username** selects the account to use to send and read messages ; its format depends on the backend +- **--recipient** and **--group** select the recipient (only one of them should be given) ; its format depends on the backend + +The **i18n.\<locale\>.yml** file (e.g. `i18n.en.yml`) contains localization strings for your locale and fun : +- *Transbot* will say "Hello" when started and "Goodbye" before shutting down : you can configure those banners in this file. +- It also defines the message pattern that terminates the bot. + +Finally, see the following chapter about the **config.yml** file. + + +### Config.yml configuration file + +Options can also be taken from a configuration file : by default the bot reads the `config.yml` file in the current directory, but this can be changed with the `--config-file` and `--config-dir` options. +This file is in YAML format with all options at the root level. Keys have the same name as command line options, with middle dashes `-` replaced with underscores `_`. + +E.g. `--ibmcloud-url https://api...` will become `ibmcloud_url: https://api...`. + +A sample configuration is available in the `test/sample-conf/` directory. + +Please first review [YAML syntax](https://yaml.org/spec/1.1/#id857168) if you don't know about YAML. + + +## Using the Signal backend + +By using `--backend signal` you can make the bot chat with Signal users. + +### Prerequisite + +You must first [install and configure *signal-cli*](https://github.com/AsamK/signal-cli#installation). 
+ +Then you must [*register* or *link*](https://github.com/AsamK/signal-cli/blob/master/man/signal-cli.1.adoc) the computer where the bot will run ; e.g. : + + signal-cli link --name MyComputer + +### Parameters + +With signal, make sure : + +- the `--username` parameter is your phone number in international format (e.g. `+33123456789`). In `config.yml`, make sure to put quotes around it to prevent YAML from parsing it as an integer (because of the 'plus' sign) +- you specify either `--recipient` as an international phone number or `--group` with a base 64 group ID (e.g. `--group "mABCDNVoEFGz0YeZM1234Q=="`). Once registered with Signal, you can list the IDs of the groups you are in with `signal-cli -u +33612345678 listGroups` + +Sample command line to run the bot with Signal : + + python3 nicobot/transbot.py -b signal -u +33612345678 -g "mABCDNVoEFGz0YeZM1234Q==" --ibmcloud-url https://api.eu-de.language-translator.watson.cloud.ibm.com/instances/a234567f-4321-abcd-efgh-1234abcd7890 --ibmcloud-apikey "f5sAznhrKQyvBFFaZbtF60m5tzLbqWhyALQawBg5TjRI" + + + +## Resources + +### Python libraries + +- [xmpppy](https://github.com/xmpppy/xmpppy) : this library is very easy to use but it does not allow easy access to thread or timestamp +- [slixmpp](https://lab.louiz.org/poezio/slixmpp) : seems like a cool library too and claims to require minimal dependencies ; however the quick start example does not work OOTB... 
+- https://github.com/horazont/aioxmpp : the official library, seems the most complete but lacks a practical introduction + +None of them seems to support OMEMO out of the box :-( + +### IBM Cloud + +- [Language Translator service](https://cloud.ibm.com/catalog/services/language-translator) +- [Language Translator API documentation](https://cloud.ibm.com/apidocs/language-translator) + +### Signal + +- [Signal home](https://signal.org/) +- [signal-cli man page](https://github.com/AsamK/signal-cli/blob/master/man/signal-cli.1.adoc) diff --git a/nicobot/__init__.py b/nicobot/__init__.py new file mode 100644 index 0000000..6817384 --- /dev/null +++ b/nicobot/__init__.py @@ -0,0 +1,3 @@ +# -*- coding: utf-8 -*- +from bot import Bot +from chatter import Chatter diff --git a/nicobot/bot.py b/nicobot/bot.py new file mode 100644 index 0000000..91ab627 --- /dev/null +++ b/nicobot/bot.py @@ -0,0 +1,49 @@ +# -*- coding: utf-8 -*- + +import atexit +import signal +import sys + +class Bot: + """ + Bot foundation + """ + + def onMessage( self, message ): + """ + Called by self.chatter whenever a message has arrived : + if the given message contains any of the keywords in any language, + will answer with a translation in a random language + including the flag of the random language. + + message: A plain text message + Returns the crafted translation + """ + pass + + + def onExit( self ): + """ + Called just before exiting ; the chatter should still be available. + Subclass MUST call registerExitHandler for this to work ! 
+ """ + pass + + + def onSignal( self, sig, frame ): + # Thanks https://stackoverflow.com/questions/23468042/the-invocation-of-signal-handler-and-atexit-handler-in-python + sys.exit(0) + + + def registerExitHandler( self ): + # Registers exit handlers to properly say goodbye + atexit.register(self.onExit) + # TODO This list does not work on Windows + for sig in [signal.SIGINT, signal.SIGTERM, signal.SIGHUP ]: + signal.signal(sig, self.onSignal) + + def run( self ): + """ + Starts the bot + """ + pass diff --git a/nicobot/chatter.py b/nicobot/chatter.py new file mode 100644 index 0000000..b98da4e --- /dev/null +++ b/nicobot/chatter.py @@ -0,0 +1,31 @@ +# -*- coding: utf-8 -*- + + +class Chatter: + """ + Bot engine interface + """ + + def start( self, bot ): + """ + Waits for messages and calls the 'onMessage' method of the given Bot + """ + pass + + def reply( self, source ): + """ + Replies to a specific message or person + """ + pass + + def send( self, message ): + """ + Sends the given message using the underlying implemented chat protocol + """ + pass + + def stop( self ): + """ + Stops waiting for messages and exits the engine + """ + pass diff --git a/nicobot/console.py b/nicobot/console.py new file mode 100644 index 0000000..f95b63d --- /dev/null +++ b/nicobot/console.py @@ -0,0 +1,27 @@ +# -*- coding: utf-8 -*- + +import logging +import sys + + +class ConsoleChatter: + """ + Bot engine that reads from a stream and outputs to another + """ + + input = None + output = None + + def __init__( self, input=sys.stdin, output=sys.stdout ): + self.input = input + self.output = output + + def start( self, bot ): + for line in self.input: + bot.onMessage( line ) + + def send( self, message ): + print( message, file=self.output, flush=True ) + + def stop( self ): + sys.exit(0) diff --git a/nicobot/signalcli.py b/nicobot/signalcli.py new file mode 100755 index 0000000..c764059 --- /dev/null +++ b/nicobot/signalcli.py @@ -0,0 +1,197 @@ +#!/usr/bin/env python3 +# -*- 
coding: utf-8 -*- + +import argparse +import logging +import sys +import os +import shutil +import subprocess +import atexit +import signal +import json +import i18n +import re +import locale + +from chatter import Chatter + +# Generic timeout for all signal-cli commands +TIMEOUT = 15 +# Custom timeout to pass to signal-cli when receiving messages +RECEIVE_TIMEOUT = 5 + + +class SignalChatter(Chatter): + """ + A signal bot relying on signal-cli + """ + + def __init__( self, username, recipient=None, group=None, signal_cli=shutil.which("signal-cli") ): + + if not username or not signal_cli: + raise ValueError("username and signal_cli must be provided") + if not recipient and not group: + raise ValueError("Either a recipient or a group must be given") + if recipient and group: + raise ValueError("Only one of recipient and group may be given") + + self.username = username + self.recipient = recipient + self.group = group + self.signal_cli = signal_cli + + # Properties set elsewhere + self.sentTimestamp = None + # If True, will terminate the main loop + self.shutdown = False + self.bot = None + + + def start( self, bot ): + + self.bot = bot + + while not self.shutdown: + self.filterMessages( self.receiveMessages() ) + + + def send( self, message ): + + cmd = [ self.signal_cli, "-u", self.username, "send", "-m", message ] + if self.recipient: + cmd = cmd + [ self.recipient ] + elif self.group: + cmd = cmd + [ "-g", self.group ] + + # throws an error in case of status <> 0 + logging.debug(cmd) + proc = subprocess.run( cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, check=True, timeout=TIMEOUT ) + logging.debug( ">>> %s" % message ) + + sent = proc.stdout + logging.debug("Sent message : %s"%repr(sent)) + self.sentTimestamp = int(sent) + + + def reply( self, source ): + # TODO + pass + + + def stop( self ): + + self.shutdown = True + + + def receiveMessages( self, timeout=RECEIVE_TIMEOUT, input=None ): + + cmd = [ self.signal_cli, "-u", self.username, "receive", 
"--json" ] + if timeout: + cmd = cmd + [ "-t", str(timeout) ] + + if not input: + # TODO Pass this log in finer (lower) level as it can be very verbose and unuseful when reading empty responses every few seconds + logging.debug(cmd) + proc = subprocess.Popen( cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE ) + input = proc.stdout + events = [] + for bline in iter(input.readline, b''): + logging.debug("Read line : %s" % bline) + try: + line = bline.decode() + except (UnicodeDecodeError, AttributeError): + line = bline + events = events + [json.loads(line.rstrip())] + + return events + + + def filterMessages( self, events ): + + for event in events: + logging.debug("Filtering message : %s" % repr(event)) + envelope = event['envelope'] + if envelope['timestamp'] > self.sentTimestamp: + if envelope['dataMessage']: + dataMessage = envelope['dataMessage'] + if dataMessage['message']: + message = event['envelope']['dataMessage']['message'] + if self.recipient: + if envelope['source'] == self.recipient: + self.bot.onMessage(message) + return True + else: + logging.debug("Discarding message not from recipient %s"%self.recipient) + elif self.group: + if dataMessage['groupInfo'] and dataMessage['groupInfo']['groupId']: + self.bot.onMessage(message) + return True + else: + logging.debug("Discarding message not from group %s" % self.group) + else: + logging.debug("Discarding message without text") + else: + logging.debug("Discarding message without data") + else: + logging.debug("Discarding message that was sent before ours") + + return False + + + +if __name__ == '__main__': + + """ FIXME This entry point is not working anymore ! """ + + parser = argparse.ArgumentParser( description='Sends a XMPP message and reads the answer' ) + # Core parameters + parser.add_argument('--username', '-u', dest='username', required=True, help="Sender's number (e.g. +12345678901)") + parser.add_argument('--group', '-g', dest='group', help="Group's ID in base64 (e.g. 
mPC9JNVoKDGz0YeZMsbL1Q==)") + parser.add_argument('--recipient', '-r', dest='recipient', help="Recipient's number (e.g. +12345678901)") + parser.add_argument('--signal-cli', '-s', dest='signal_cli', default=shutil.which("signal-cli"), help="Path to `signal-cli` if not in PATH") + # Misc. options + parser.add_argument("--i18n-dir", "-I", dest="i18n_dir", default=os.path.dirname(os.path.realpath(__file__)), help="Directory where to find translation files. Defaults to this script's directory.") + parser.add_argument('--verbosity', '-V', dest='log_level', default="INFO", help="Log level") + parser.add_argument("--test", '-T', dest="test", action="store_true", default=False, help="Activate test mode") + parser.add_argument('--locale', '-L', dest='locale', default=None, help="Change default locale (e.g. 'fr')") + args = parser.parse_args() + + if not args.signal_cli: + raise ValueError("Could not find the 'signal-cli' command in PATH and no --signal-cli given") + + if not args.recipient and not args.group: + raise ValueError("Either --recipient or --group must be provided") + + # Logging configuration + # TODO Allow for a trace level (high-volume debug) + # TODO How to tag logs from this module so that their level can be tuned specifically ? + logLevel = getattr(logging, args.log_level.upper(), None) + if not isinstance(logLevel, int): + raise ValueError('Invalid log level: %s' % args.log_level) + # Logs are output to stderr ; stdout is reserved to print the answer(s) + logging.basicConfig(level=logLevel, stream=sys.stderr) + + logging.debug("Current locale : %s"%repr(locale.getlocale())) + if args.locale: + loc = args.locale + else: + loc = locale.getlocale()[0] + + # See https://pypi.org/project/python-i18n/ + logging.debug("i18n_dir : %s"%args.i18n_dir) + # FIXME Manually set the locale : how come a Python library named 'i18n' doesn't take into account the Python locale by default ? 
+ i18n.set('locale',loc.split('_')[0]) + logging.debug("i18n locale : %s"%i18n.get('locale')) + i18n.set('filename_format', 'i18n.{locale}.{format}') # Removing the namespace is simpler for us + i18n.load_path.append(args.i18n_dir) + + # This MUST be instantiated AFTER i18n has been configured ! + RE_SHUTDOWN = re.compile( i18n.t('Shutdown'), re.IGNORECASE ) + + """ Real start """ + bot = SignalChatter( username=args.username, signal_cli=args.signal_cli, recipient=args.recipient, group=args.group ) + if args.test: + bot.run(sys.stdin) + else: + bot.run() diff --git a/nicobot/transbot.py b/nicobot/transbot.py new file mode 100755 index 0000000..fe0013b --- /dev/null +++ b/nicobot/transbot.py @@ -0,0 +1,479 @@ +#!/usr/bin/env python3 +# -*- coding: utf-8 -*- + +""" + Sample bot that translates text whenever it sees a message with one of its keywords. +""" + +import argparse +import logging +import sys +import os +import shutil +import json +import i18n +import re +import locale +import requests +import random +# Provides an easy way to get the unicode sequence for country flags +import flag +import yaml + +# Own classes +from bot import Bot +from console import ConsoleChatter +from signalcli import SignalChatter + + +# Default timeout for requests in seconds +# Note : More than 10s recommended (30s ?) 
on IBM Cloud with a free account +TIMEOUT = 60 + +# Set to None to translate keywords in all available languages +# Set to something > 0 to limit the number of translations for the keywords (for tests) +LIMIT_KEYWORDS = None + +# Default (empty actually) configuration, to ease depth navigation +class Config: + + def __init__(self): + self.__dict__.update({ + 'backend': "console", + 'config_file': None, + 'config_dir': os.getcwd(), + 'group': None, + 'ibmcloud_url': None, + 'ibmcloud_apikey': None, + 'input_file': sys.stdin, + 'keywords': [], + 'keywords_file': None, + 'languages': [], + 'languages_file': None, + 'locale': None, + 'recipient': None, + 'shutdown': None, + 'signal_cli': shutil.which("signal-cli"), + 'username': None, + 'verbosity': "INFO" + }) + + +""" + TODO Find a better way to log requests.Response objects +""" +def _logResponse( r ): + logging.debug("<<< Response : %s\tbody: %.60s[...]", repr(r), r.content ) + + + +class TransBot(Bot): + """ + Sample bot that translates text. + + It only answers to messages containing defined keywords. + It uses IBM Watson™ Language Translator (see API docs : https://cloud.ibm.com/apidocs/language-translator) to translate the text. + """ + + + def __init__( self, chatter, ibmcloud_url, ibmcloud_apikey, keywords=None, keywords_file=None, languages=None, languages_file=None, shutdown_pattern=r'bye nicobot' ): + """ + keywords: list of keywords that will trigger this bot (in any supported language) + keywords_file: JSON file where to find the list of keywords (or write into) + languages: List of supported languages in this format : https://cloud.ibm.com/apidocs/language-translator#list-identifiable-languages + languages_file: JSON file where to find the list of target languages (or write into) + shutdown_pattern: a regular expression pattern that terminates this bot + chatter: the backend chat engine + ibmcloud_url (required): IBM Cloud API base URL (e.g. 
'https://api.eu-de.language-translator.watson.cloud.ibm.com/instances/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx') + ibmcloud_apikey (required): IBM Cloud API key (e.g. 'dG90byBlc3QgZGFucyBsYSBwbGFjZQo') + store_path: Base directory where to cache files + """ + + self.ibmcloud_url = ibmcloud_url + self.ibmcloud_apikey = ibmcloud_apikey + self.chatter = chatter + + # After IBM credentials have been set we can retrieve the list of supported languages + if languages: + self.languages = languages + else: + self.languages = self.loadLanguages(file=languages_file) + # How many different languages to try to translate to + self.tries = 3 + + # After self.languages has been set, we can iterate over to translate keywords + kws = self.loadKeywords( keywords=keywords, file=keywords_file, limit=LIMIT_KEYWORDS ) + pattern = kws[0] + for keyword in kws[1:]: + pattern = pattern + r'|' + keyword + # Built regular expression pattern that triggers an answer from this bot + self.re_keywords = pattern + # Regular expression pattern of messages that stop the bot + self.re_shutdown = shutdown_pattern + + + def loadLanguages( self, force=False, file=None ): + """ + Loads the list of known languages. + + Requires the IBM Cloud credentials to be set before ! + + If force==True then calls the remote service, otherwise reads from the given file if given + """ + + # TODO It starts with the same code as in loadKeywords : make it a function + + # Gets the list from a local file + if not force and file: + logging.debug("Reading from %s..." 
% file) + try: + with open(file,'r') as f: + j = json.load(f) + return j['languages'] + except: + logging.info("Could not read languages list from %s" % file) + pass + + # Else, gets the list from the cloud + # curl --user apikey:{apikey} "{url}/v3/identifiable_languages?version=2018-05-01" + url = "%s/v3/identifiable_languages?version=2018-05-01" % self.ibmcloud_url + headers = { + 'Accept': 'application/json', + 'X-Watson-Learning-Opt-Out': 'true' + } + logging.debug(">>> GET %s, %s",url,repr(headers)) + r = requests.get(url, headers=headers, auth=('apikey',self.ibmcloud_apikey), timeout=TIMEOUT) + _logResponse(r) + if r.status_code == requests.codes.ok: + # Save it for the next time + if file: + try: + logging.debug("Saving languages to %s..." % file) + with open(file,'w') as f: + f.write(r.text) + except: + logging.exception("Could not save the languages list to %s" % file) + pass + else: + logging.debug("Not saving languages as no file was given") + return r.json()['languages'] + else: + r.raise_for_status() + + + def loadKeywords( self, keywords=[], file=None, limit=None ): + """ + Generates a list of translations from a list of keywords. + + Requires self.languages to be filled before ! + + If 'keywords' is not empty, will download the translations from IBM Cloud into 'file'. + Otherwise, will try to read from 'file', falling back to IBM Cloud and saving it into 'file' if it fails. + """ + + # TODO It starts with the same code as in loadLanguages : make it a function + + # Gets the list from a local file + if not keywords or len(keywords) == 0: + logging.debug("Reading from %s..." 
% file) + try: + with open(file,'r') as f: + j = json.load(f) + logging.debug("Read keyword list : %s",repr(j)) + return j + except: + raise ValueError("Could not read keywords list from %s and no keyword given" % file) + pass + + kws = [] + + for keyword in keywords: + logging.debug("Init %s...",keyword) + kws = kws + [ keyword ] + + for lang in self.languages: + # For tests, in order not to use all credits, we can limit the number of calls here + if limit and len(kws) >= limit: + break + try: + translation = self.translate( keyword, target=lang['language'] ) + translated = translation['translation'].rstrip() + logging.debug("Adding translation %s in %s for %s", translated, lang, keyword) + kws = kws + [ translated ] + except: + logging.exception("Could not translate %s into %s", keyword, repr(lang)) + pass + logging.debug("Keywords : %s", repr(kws)) + + if file: + try: + logging.debug("Saving keywords translations into %s...", file) + with open(file,'w') as f: + json.dump(kws,f) + except: + logging.exception("Could not save keywords translations into %s", file) + pass + else: + logging.debug("Not saving keywords as no file was given") + + return kws + + + def translate( self, message, target, source=None ): + """ + Translates a given message. + + target: Target language short code (e.g. 'en') + source: Source language short code ; if not given will try to guess + + Returns the plain translated message or None if no translation could be found. + """ + + # curl -X POST -u "apikey:{apikey}" --header "Content-Type: application/json" --data "{\"text\": [\"Hello, world! 
\", \"How are you?\"], \"model_id\":\"en-es\"}" "{url}/v3/translate?version=2018-05-01" + url = "%s/v3/translate?version=2018-05-01" % self.ibmcloud_url + body = { + "text": [message], + "target": target + } + if source: + body['source'] = source + headers = { + 'Content-Type': 'application/json', + 'Accept': 'application/json', + 'X-Watson-Learning-Opt-Out': 'true' + } + logging.debug(">>> POST %s, %s, %s",url,repr(body),repr(headers)) + r = requests.post(url, json=body, headers=headers, auth=('apikey',self.ibmcloud_apikey), timeout=TIMEOUT) + # TODO Log full response when it's usefull (i.e. when a message is going to be answered) + _logResponse(r) + if r.status_code == requests.codes.ok: + j = r.json() + translation = j['translations'] + return translation[0] + # A 404 can happen if there is no translation available + elif r.status_code == requests.codes.not_found: + return None + else: + r.raise_for_status() + + + def onMessage( self, message ): + """ + Called by self.chatter whenever a message hsa arrived : + if the given message contains any of the keywords in any language, + will answer with a translation in a random language + including the flag of the random language. + + message: A plain text message + Returns the crafted translation + """ + + # FIXME re.compile((i18n.t('Shutdown'),re.IGNORECASE).search(message) does not work + # as expected so we use re.search(...) 
+ if re.search( self.re_shutdown, message, re.IGNORECASE ): + logging.debug("Shutdown asked") + self.chatter.stop() + + # Only if the message contains a keyword + elif re.search( self.re_keywords, message, flags=re.IGNORECASE ): + + # Selects a few random target languages each time + langs = random.choices( self.languages, k=self.tries ) + + for lang in langs: + # Gets a translation in this random language + translation = self.translate( message, target=lang['language'] ) + if translation: + translated = translation['translation'].rstrip() + try: + lang_emoji = flag.flag(lang['language']) + except ValueError: + lang_emoji= "🏳️‍🌈" + answer = "%s %s" % (translated,lang_emoji) + logging.debug(">> %s" % answer) + self.chatter.send(answer) + # Returns as soon as one translation was done + return + else: + pass + + logging.warning("Could not find a translation in %s for %s",repr(langs),message) + + else: + logging.debug("Message did not have a keyword") + + + def onExit( self ): + + sent = self.chatter.send( i18n.t('Goodbye') ) + + + def run( self ): + """ + Starts the bot : + + 1. Sends a hello message + 2. 
Waits for messages to translate + """ + + self.chatter.send( i18n.t('Hello') ) + self.registerExitHandler() + self.chatter.start(self) + + + +if __name__ == '__main__': + + """ + A convenient CLI to play with this bot + """ + + # + # Two-pass arguments parsing + # + + config = Config() + + parser = argparse.ArgumentParser( description="A bot that reacts to messages with given keywords by responding with a random translation" ) + # Bootstrap options + parser.add_argument("--config-file", "-c", dest="config_file", help="YAML configuration file.") + parser.add_argument("--config-dir", "-C", dest="config_dir", default=config.config_dir, help="Directory where to find configuration, cache and translation files by default.") + parser.add_argument('--verbosity', '-V', dest='verbosity', default=config.verbosity, help="Log level") + # Core arguments + parser.add_argument("--keyword", "-k", dest="keywords", action="append", help="Keyword bot should react to (will write them into the file specified with --keywords-file)") + parser.add_argument("--keywords-file", dest="keywords_file", help="File to load from and write keywords to") + parser.add_argument("--language", "-l", dest="languages", action="append", help="Target language") + parser.add_argument("--languages-file", dest="languages_file", help="File to load from and write languages to") + parser.add_argument("--shutdown", dest="shutdown", help="Shutdown keyword regular expression pattern") + parser.add_argument("--ibmcloud-url", dest="ibmcloud_url", help="IBM Cloud API base URL (get it from your resource https://cloud.ibm.com/resources)") + parser.add_argument("--ibmcloud-apikey", dest="ibmcloud_apikey", help="IBM Cloud API key (get it from your resource : https://cloud.ibm.com/resources)") + # Chatter-generic arguments + parser.add_argument("--backend", "-b", dest="backend", choices=["signal","console"], default=config.backend, help="Chat backend to use") + parser.add_argument("--input-file", "-i", dest="input_file", 
default=config.input_file, help="File to read messages from (one per line)") + parser.add_argument('--username', '-u', dest='username', help="Sender's number (e.g. +12345678901 for the 'signal' backend)") + parser.add_argument('--group', '-g', dest='group', help="Group's ID in base64 (e.g. 'mPC9JNVoKDGz0YeZMsbL1Q==' for the 'signal' backend)") + parser.add_argument('--recipient', '-r', dest='recipient', help="Recipient's number (e.g. +12345678901)") + # Signal-specific arguments + parser.add_argument('--signal-cli', dest='signal_cli', default=config.signal_cli, help="Path to `signal-cli` if not in PATH") + # Misc. options + parser.add_argument('--locale', '-L', dest='locale', default=config.locale, help="Change default locale (e.g. 'fr')") + + # + # 1st pass only matters for 'bootstrap' options : configuration file and logging + # + parser.parse_args(namespace=config) + + # Logging configuration + logLevel = getattr(logging, config.verbosity.upper(), None) + if not isinstance(logLevel, int): + raise ValueError('Invalid log level: %s' % config.verbosity) + # Logs are output to stderr ; stdout is reserved to print the answer(s) + logging.basicConfig(level=logLevel, stream=sys.stderr) + logging.debug( "Configuration for bootstrap : %s", repr(vars(config)) ) + + # Loads the config file that will be used to lookup some missing parameters + if not config.config_file: + config.config_file = os.path.join(config.config_dir,"config.yml") + try: + with open(config.config_file,'r') as file: + # The FullLoader parameter handles the conversion from YAML + # scalar values to Python the dictionary format + dictConfig = yaml.full_load(file) + logging.debug("Successfully loaded configuration from %s : %s" % (config.config_file,repr(dictConfig))) + config.__dict__.update(dictConfig) + except: + pass + # From here the config object has only the default values for all configuration options + #logging.debug( "Configuration after bootstrap : %s", repr(vars(config)) ) + + # + # 2nd pass 
parses all options + # + # Updates the existing config object with all parsed options + parser.parse_args(namespace=config) + logging.debug( "Final configuration : %s", repr(vars(config)) ) + + # + # From here the config object has default options from: + # 1. hard-coded default values + # 2. configuration file overrides + # 3. command line overrides + # + # We can check the required options that could not be checked before + # (because required arguments may have been set from the config file and not on the command line) + # + + # i18n + l10n + logging.debug("Current locale : %s"%repr(locale.getlocale())) + if not config.locale: + config.locale = locale.getlocale()[0] + # See https://pypi.org/project/python-i18n/ + # FIXME Manually sets the locale : how come a Python library named 'i18n' doesn't take into account the Python locale by default ? + i18n.set('locale',config.locale.split('_')[0]) + logging.debug("i18n locale : %s"%i18n.get('locale')) + i18n.set('filename_format', 'i18n.{locale}.{format}') # Removing the namespace from keys is simpler for us + i18n.load_path.append(config.config_dir) + + if not config.ibmcloud_url: + raise ValueError("Missing required parameter : --ibmcloud-url") + if not config.ibmcloud_apikey: + raise ValueError("Missing required parameter : --ibmcloud-apikey") + + # config.keywords is used if given + # else, check for an existing keywords_file + if not config.keywords_file: + # As a last resort, use 'keywords.json' in the config directory + config.keywords_file = os.path.join(config.config_dir,'keywords.json') + # Convenience check to better warn the user + if not config.keywords: + try: + with open(config.keywords_file,'r') as f: + pass + except: + raise ValueError("Could not open %s : please generate it with --keyword first or create the file indicated with --keywords-file"%config.keywords_file) + + # config.languages is used if given + # else, check for an existing languages_file + if not config.languages_file: + # As a last resort, 
use 'languages.json' in the config directory + config.languages_file = os.path.join(config.config_dir,'languages.json') + # Convenience check to better warn the user + if not config.languages: + try: + with open(config.languages_file,'r') as f: + pass + except OSError: + raise ValueError("Could not open %s : please remove --languages to generate it automatically or create the file indicated with --languages-file"%config.languages_file) + + if not config.shutdown: + # This MUST be instantiated AFTER i18n has been configured ! + config.shutdown = i18n.t('Shutdown') + + # Creates the chat engine depending on the 'backend' parameter + if config.backend == "signal": + if not config.signal_cli: + raise ValueError("Could not find the 'signal-cli' command in PATH and no --signal-cli given") + if not config.username: + raise ValueError("Missing a username") + if not config.recipient and not config.group: + raise ValueError("Either --recipient or --group must be provided") + chatter = SignalChatter( + username=config.username, + recipient=config.recipient, + group=config.group, + signal_cli=config.signal_cli) + # By default (or if backend == "console"), will read from stdin or a given file and output to console + else: + chatter = ConsoleChatter(config.input_file,sys.stdout) + + # + # Real start + # + + TransBot( + keywords=config.keywords, keywords_file=config.keywords_file, + languages=config.languages, languages_file=config.languages_file, + ibmcloud_url=config.ibmcloud_url, ibmcloud_apikey=config.ibmcloud_apikey, + shutdown_pattern=config.shutdown, + chatter=chatter + ).run() diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000..58368ff --- /dev/null +++ b/requirements.txt @@ -0,0 +1,8 @@ +# Requirements for nicobot.py +# https://requests.readthedocs.io/en/master/ +requests +# https://github.com/cvzi/flag +emoji-country-flag +python-i18n +# https://pyyaml.org/wiki/PyYAMLDocumentation +pyyaml diff --git a/test/sample-conf/config.yml 
b/test/sample-conf/config.yml new file mode 100644 index 0000000..216c2d8 --- /dev/null +++ b/test/sample-conf/config.yml @@ -0,0 +1,14 @@ +# IBM Cloud credentials for your 'Language Translator' service instance : get them from your +# See detailed instructions : https://cloud.ibm.com/apidocs/language-translator +ibmcloud_url: https://api.us-south.language-translator.watson.cloud.ibm.com/instances/6bbda3b3-d572-45e1-8c54-22d6ed9e52c2 +ibmcloud_apikey: "f5sAznhrKQyvBFFaZbtF60m5tzLbqWhyALQawBg5TjRI" + +backend: console +#backend: signal + +# Signal credentials +# Make sure to put quotes around the username field as it is a phone number for Signal +username: "+33123456789" +recipient: "+33123456789" +# Get this group ID with the command `signal-cli -u +33123456789 listGroups` +#group: "mABCDNVoEFGz0YeZM1234Q==" diff --git a/test/sample-conf/i18n.en.yml b/test/sample-conf/i18n.en.yml new file mode 100644 index 0000000..15ffaea --- /dev/null +++ b/test/sample-conf/i18n.en.yml @@ -0,0 +1,4 @@ +en: + Hello: 🤟 nicobot ready 🤟 + Goodbye: See you later 👋 + Shutdown: bye nicobot diff --git a/test/sample-conf/i18n.fr.yml b/test/sample-conf/i18n.fr.yml new file mode 100644 index 0000000..53d0360 --- /dev/null +++ b/test/sample-conf/i18n.fr.yml @@ -0,0 +1,4 @@ +fr: + Hello: "🤟 nicobot paré 🤟" + Goodbye: A+ 👋 + Shutdown: couché nicobot diff --git a/test/sample-conf/keywords.json b/test/sample-conf/keywords.json new file mode 100644 index 0000000..fb6a370 --- /dev/null +++ b/test/sample-conf/keywords.json @@ -0,0 +1 @@ +["bonjour", "\u0645\u0631\u062d\u0628\u0627", "\u0410\u043b\u043e?", "\u09b9\u09cd\u09af\u09be\u09b2\u09cb", "Ahoj.", "Hallo?", "Guten Tag", "\u0395\u03bc\u03c0\u03c1\u03cc\u03c2!", "Hello", "hola", "Tere.", "Hei", "Hello", "\u0ab9\u0ac7\u0ab2\u0acb\u0ab5", "\u05d4\u05dc\u05d5", "\u0928\u092e\u0938\u094d\u0915\u093e\u0930", "Zdravo.", "Hello", "Ciao", "\u30cf\u30ed\u30fc", "\uc548\ub155\ud558\uc138\uc694", "Labas.", "Sveika.", "\u0d39\u0d32\u0d4b", "Hello.", 
"Hello", "Hallo", "\u0939\u0947\u0932\u094b", "Hallo", "Witaj", "Ol\u00e1", "Salut", "\u0417\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435.", "\u0dc4\u0dd9\u0dbd\u0ddd", "Ahoj.", "Zdravo.", "Hej.", "\u0bb9\u0bb2\u0bcb", "\u0c39\u0c32\u0c4b", "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35", "Merhaba.", "\u0633\u0644\u0627\u0645", "Xin ch\u00e0o", "\u4f60\u597d", "\u4f60\u597d", "coucou", "\u0645\u0633\u062d\u0648\u0631", "\"", "\u0995\u09c1\u0981\u099a\u09be\u09a8\u09cb", "scelov\u00e1no", "couched", "Couched", "- ...", "couched", "conectado", "kubitud", "Coutettava", "Couch\u00e9", "@ info", "\u0a9c\u0acb\u0aa1\u0ac7\u0ab2", "\u05de\u05e6\u05e8\u05e3", "\u0926\u093f\u092f\u093e \u0939\u0941\u0906", "spojeno", "Kusz\u00e1lt", "accoppiata", "\u30af\u30c1\u30c9", "\ucfe8\ud558\uac8c", "kuita", "1.", "\u0d15\u0d1a\u0d4d\u0d1a\u0d35\u0d1f\u0d02", "Kuy", "kuffar", "hovet", "\u091c\u094b\u0930\u0926\u093e\u0930", "Couches", "przewr\u00f3cony", "cucut\u0103", "\u041a\u0430\u0448\u0435\u043b\u044c", "\u0d9a\u0daf\u0dc0\u0dd4\u0dbb", "pre\u0165ahovan\u00e9", "couched", "couched", "\u0b95\u0bc2\u0bae\u0bcd\u0baa\u0bc1", "\u0c15\u0c42\u0c30\u0c4d\u0c2a\u0c41", "\u0e16\u0e39\u0e01\u0e1e\u0e31\u0e01\u0e44\u0e27\u0e49", "kanepeden", "\u06a9\u0648\u062a", "C\u00f3", "\u5e93\u7126", "\u9999", "salut", "hello"] diff --git a/test/sample-conf/languages.json b/test/sample-conf/languages.json new file mode 100644 index 0000000..2490348 --- /dev/null +++ b/test/sample-conf/languages.json @@ -0,0 +1,228 @@ +{ + "languages" : [ { + "language" : "af", + "name" : "Afrikaans" + }, { + "language" : "ar", + "name" : "Arabic" + }, { + "language" : "az", + "name" : "Azerbaijani" + }, { + "language" : "ba", + "name" : "Bashkir" + }, { + "language" : "be", + "name" : "Belarusian" + }, { + "language" : "bg", + "name" : "Bulgarian" + }, { + "language" : "bn", + "name" : "Bengali" + }, { + "language" : "ca", + "name" : "Catalan" + }, { + "language" : "cs", + "name" : "Czech" + }, { + 
"language" : "cv", + "name" : "Chuvash" + }, { + "language" : "da", + "name" : "Danish" + }, { + "language" : "de", + "name" : "German" + }, { + "language" : "el", + "name" : "Greek" + }, { + "language" : "en", + "name" : "English" + }, { + "language" : "eo", + "name" : "Esperanto" + }, { + "language" : "es", + "name" : "Spanish" + }, { + "language" : "et", + "name" : "Estonian" + }, { + "language" : "eu", + "name" : "Basque" + }, { + "language" : "fa", + "name" : "Persian" + }, { + "language" : "fi", + "name" : "Finnish" + }, { + "language" : "fr", + "name" : "French" + }, { + "language" : "ga", + "name" : "Irish" + }, { + "language" : "gu", + "name" : "Gujarati" + }, { + "language" : "he", + "name" : "Hebrew" + }, { + "language" : "hi", + "name" : "Hindi" + }, { + "language" : "hr", + "name" : "Croatian" + }, { + "language" : "ht", + "name" : "Haitian" + }, { + "language" : "hu", + "name" : "Hungarian" + }, { + "language" : "hy", + "name" : "Armenian" + }, { + "language" : "is", + "name" : "Icelandic" + }, { + "language" : "it", + "name" : "Italian" + }, { + "language" : "ja", + "name" : "Japanese" + }, { + "language" : "ka", + "name" : "Georgian" + }, { + "language" : "kk", + "name" : "Kazakh" + }, { + "language" : "km", + "name" : "Central Khmer" + }, { + "language" : "ko", + "name" : "Korean" + }, { + "language" : "ku", + "name" : "Kurdish" + }, { + "language" : "ky", + "name" : "Kirghiz" + }, { + "language" : "lo", + "name" : "Lao" + }, { + "language" : "lt", + "name" : "Lithuanian" + }, { + "language" : "lv", + "name" : "Latvian" + }, { + "language" : "ml", + "name" : "Malayalam" + }, { + "language" : "mn", + "name" : "Mongolian" + }, { + "language" : "mr", + "name" : "Marathi" + }, { + "language" : "ms", + "name" : "Malay" + }, { + "language" : "mt", + "name" : "Maltese" + }, { + "language" : "my", + "name" : "Burmese" + }, { + "language" : "nb", + "name" : "Norwegian Bokmal" + }, { + "language" : "ne", + "name" : "Nepali" + }, { + "language" : "nl", + 
"name" : "Dutch" + }, { + "language" : "nn", + "name" : "Norwegian Nynorsk" + }, { + "language" : "pa", + "name" : "Panjabi" + }, { + "language" : "pa-PK", + "name" : "Panjabi (Shahmukhi script, Pakistan)" + }, { + "language" : "pl", + "name" : "Polish" + }, { + "language" : "ps", + "name" : "Pushto" + }, { + "language" : "pt", + "name" : "Portuguese" + }, { + "language" : "ro", + "name" : "Romanian" + }, { + "language" : "ru", + "name" : "Russian" + }, { + "language" : "si", + "name" : "Sinhala" + }, { + "language" : "sk", + "name" : "Slovakian" + }, { + "language" : "sl", + "name" : "Slovenian" + }, { + "language" : "so", + "name" : "Somali" + }, { + "language" : "sq", + "name" : "Albanian" + }, { + "language" : "sr", + "name" : "Serbian" + }, { + "language" : "sv", + "name" : "Swedish" + }, { + "language" : "ta", + "name" : "Tamil" + }, { + "language" : "te", + "name" : "Telugu" + }, { + "language" : "th", + "name" : "Thai" + }, { + "language" : "tl", + "name" : "Tagalog" + }, { + "language" : "tr", + "name" : "Turkish" + }, { + "language" : "uk", + "name" : "Ukrainian" + }, { + "language" : "ur", + "name" : "Urdu" + }, { + "language" : "vi", + "name" : "Vietnamese" + }, { + "language" : "zh", + "name" : "Simplified Chinese" + }, { + "language" : "zh-TW", + "name" : "Traditional Chinese" + } ] +} \ No newline at end of file
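
The sample `languages.json` above is a single object with a top-level `languages` array whose entries carry a `language` code and a `name`. As a minimal sketch only, assuming a file of that shape, the following shows how such a file could be read and a random target language picked. `load_language_codes` and `pick_random_language` are hypothetical helper names used for illustration; they are not taken from the nicobot sources, and the bot's actual selection logic may differ.

```python
import json
import random


def load_language_codes(path):
    """Return the language codes found in a file shaped like languages.json.

    Assumes the layout shown in the sample: {"languages": [{"language": ..., "name": ...}, ...]}
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return [entry["language"] for entry in data["languages"]]


def pick_random_language(codes, exclude=None, rng=random):
    """Pick a random code, optionally skipping one (e.g. the source language)."""
    candidates = [code for code in codes if code != exclude]
    if not candidates:
        raise ValueError("No candidate language left to pick from")
    return rng.choice(candidates)
```

For instance, `pick_random_language(load_language_codes("test/sample-conf/languages.json"), exclude="en")` would return one of the listed codes other than `en`. Passing `rng` explicitly (for example a seeded `random.Random`) keeps the choice reproducible in tests.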