mirror of https://github.com/nicolabs/nicobot.git synced 2025-09-07 01:40:41 +02:00

nicobo 7a95ea8c9e - remove arm/v5 & arm/v6 for now in order to focus on fewer platforms to debug the build process		2021-01-18 22:39:35 +01:00
.github	- remove arm/v5 & arm/v6 for now in order to focus on fewer platforms to debug the build process	2021-01-18 22:39:35 +01:00
docker	+ smaller images that should work on more CPU architectures	2021-01-17 23:40:30 +01:00
nicobot	fixed : make sure python3 is used (not python 2)	2020-12-20 15:46:17 +01:00
tests	+ test on max_count	2020-05-26 22:37:08 +02:00
.gitignore
.travis.yml	PyYAML requires Python 3.5+	2020-12-20 22:42:04 +01:00
alpine.Dockerfile	- rustup does not exit for arm on alpine	2021-01-18 08:03:07 +01:00
debian-signal.Dockerfile	+ smaller images that should work on more CPU architectures	2021-01-17 23:40:30 +01:00
debian.Dockerfile	+ smaller images that should work on more CPU architectures	2021-01-17 23:40:30 +01:00
Dockerfile-alpine	trying arm64 on alpine...	2021-01-04 00:35:31 +01:00
LICENSE
README.md	re-read & re-fixed the readme	2021-01-03 22:59:05 +01:00
requirements-build.txt	requirements.txt split in build & runtime to make it clear	2020-12-20 15:43:38 +01:00
requirements-runtime.txt	requirements.txt split in build & runtime to make it clear	2020-12-20 15:43:38 +01:00
setup.py	PyYAML requires Python 3.5+	2020-12-20 22:42:04 +01:00

README.md

nicobot

About

A collection of 🤟 cool 🤟 chat bots :

Transbot is a demo chatbot interface to IBM Watson™ Language Translator service
Askbot is a one-shot chatbot that will send a message and wait for an answer

⚠️ My bots are cool, but they are absolutely EXPERIMENTAL : use them at your own risk !

This project features :

Participating in Signal conversations
Participating in XMPP / Jabber conversations
Using IBM Watson™ Language Translator cloud API

Requirements & installation

The bots can be installed and run from any of the following :

the Python package
the source code
the Docker images

Python package installation

A classic (Python package) installation requires :

Python 3 (>= 3.5) and pip (should be bundled with Python) ; e.g. on Debian : sudo apt install python3 python3-pip
signal-cli for the Signal backend (see Using the Signal backend below for requirements)
For transbot : an IBM Cloud account (free account ok)

To install, simply do :

pip3 install nicobot

Then, you can run the bots by their name, thanks to the provided commands :

# Runs the 'transbot' bot
transbot [options...]
# Runs the 'askbot' bot
askbot [options...]

Installation from source

To install from source you need to fulfill the same requirements as for a package installation (see above), then download the code and build it :

git clone https://github.com/nicolabs/nicobot.git
cd nicobot
pip3 install -r requirements-runtime.txt

Now you can run the bots by their name as if they were installed via the package :

# Runs the 'transbot' bot
transbot [options...]
# Runs the 'askbot' bot
askbot [options...]

Docker usage

At the present time there are several Docker images available, with the following tags :

debian : if you have several images with the debian base, this may be the most space-efficient (as base layers will be shared with other images)
debian-slim : if you want a smaller-sized image and you don't run other images based on the debian image (as it will not share as many layers as the above debian tag)
alpine : this should be the smallest image in theory, but it's more complex to maintain and therefore might not meet this expectation ; please check/test before use

The current state of those images is such that I suggest you try the debian-slim image first and switch to another one if you encounter issues or have a specific use case to solve.

The container is invoked this way :

docker ... [--signal-register] <bot name> <bot arguments>

--signal-register will display a QR code in the console : scan it with the Signal app on the device to link the bot with (it will simply do the link command inside the container ; read more about this later in this document). If this option is not given and the signal backend is used, it will use the .local/share/signal-cli directory from the container or fail.
<bot name> is either transbot or askbot
<bot arguments> is the list of arguments to pass to the bot

Sample command to start a container :

docker run --rm -it -v "myconfdir:/etc/nicobot" nicolabs/nicobot:debian-slim transbot -C /etc/nicobot

In this example myconfdir is a local directory with configuration files for the bot (-C option), but you could set all arguments on the command line if you don't want to deal with files.

You can also use docker volumes to persist Signal and IBM Cloud credentials and configuration :

docker run --rm -it -v "myconfdir:/etc/nicobot" -v "$HOME/.local/share/signal-cli:/root/.local/share/signal-cli" nicolabs/nicobot:debian-slim transbot -C /etc/nicobot

All options that can be passed to the bots' command line can also be passed to the docker command line.

Transbot instructions

Transbot is a demo chatbot interface to IBM Watson™ Language Translator service.

Again, this is NOT STABLE code, there is absolutely no warranty it will work or not harm butterflies on the other side of the world... Use it at your own risk !

It detects configured patterns or keywords in messages (either received directly or from a group chat) and answers with a translation of the given text.

The sample configuration in tests/transbot-sample-conf demonstrates how to make the bot answer messages given in the form nicobot <text_to_translate> in <language> (or simply nicobot <text_to_translate>, into the current language) with a translation of <text_to_translate>.

Transbot can also pick a random language to translate into ; the sample configuration file shows how to make it translate messages containing "Hello" or "Goodbye" into a random language.

Quick start

Install nicobot (see above)
Create a Language Translator service instance on IBM Cloud and get the URL and API key from your console
Fill them into tests/transbot-sample-conf/config.yml (ibmcloud_url and ibmcloud_apikey)
Run transbot -C tests/transbot-sample-conf (with docker it will be something like docker run -it -v "$(pwd)/tests/transbot-sample-conf:/etc/nicobot" nicolabs/nicobot:debian-slim transbot -C /etc/nicobot)
Type Hello world in the console : the bot will print a random translation of "Hello World"
Type Bye nicobot : the bot will terminate

You may now explore the dedicated chapters below for more options, including sending & receiving messages through XMPP or Signal instead of keyboard & console.

Main configuration options and files

This paragraph introduces the most important options to make this bot work. Please also check the generic options below, and finally run transbot -h to get an exact list of all options.

The bot needs several configuration files that will be generated / downloaded the first time if not provided :

--keyword and --keywords-file will help you generate the list of keywords that will trigger the bot. To do this, run transbot --keyword <a_keyword> --keyword <another_keyword> ... a first time : this will download all known translations for these keywords and save them into a keywords.json file. Next time you run the bot, don't use the --keyword option : it will reuse this saved keywords list. You can use --keywords-file to change the file name.
--languages-file : The first time the bot runs it will download the list of supported languages into languages.<locale>.json and reuse it afterwards. You can edit it, to keep just the set of languages you want for instance. You can also use the --locale option to indicate the desired locale.
--locale will select the locale to use for default translations (with no target language specified) and as the default parsing language for keywords.
--ibmcloud-url and --ibmcloud-apikey take arguments you can obtain from your IBM Cloud account (create a Language Translator instance then go to the resource list)

The i18n.<locale>.yml file contains localization strings for your locale :

Transbot will say "Hello" when started and "Goodbye" before shutting down : you can configure those banners in this file.
It also defines the pattern that terminates the bot.

A sample configuration is available in the tests/transbot-sample-conf/ directory.

Askbot instructions

Askbot is a one-shot chatbot that will send a message and wait for an answer.

Again, this is NOT STABLE code, there is absolutely no warranty it will work or not harm butterflies on the other side of the world... Use it at your own risk !

When run, it will send a message and wait for an answer, in different ways (see options below). Once the configured conditions are met, the bot will terminate and print the result in JSON format. This JSON structure has to be parsed in order to retrieve the answer and determine which exit condition(s) were met.

Main configuration options

Run askbot -h to get a description of all options.

Below are the most important configuration options for this bot (please also check the generic options below) :

--max-count defines the maximum number of messages to read before exiting. This allows the recipient to split the answer into several messages, for instance. However, all messages are currently returned by the bot at once at the end, so they cannot be parsed on the fly by an external program. To give the recipient x tries, run this bot x times instead.
--pattern defines a pattern that will end the bot when matched. This is the way to detect an answer. It takes 2 arguments : a symbolic name and a regular expression pattern that will be tested against each message. It can be passed several times in the same command line, hence the <name> argument, which will allow identifying which pattern(s) matched.
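Since retries are obtained by running the bot several times, a wrapper script could loop over the command. The following is only a sketch : it assumes that askbot prints nothing but the JSON result on its standard output, and the function name ask_with_retries and its run parameter are hypothetical helpers, not part of nicobot.

```python
import json
import subprocess


def ask_with_retries(command, tries, run=subprocess.run):
    """Run an askbot command up to `tries` times and return the last result.

    `command` is the full argument list, e.g. ["askbot", "-m", "...", ...].
    `run` is injectable so the loop can be exercised without a real backend.
    Assumption: askbot prints only the JSON result on standard output.
    """
    result = None
    for _ in range(tries):
        completed = run(command, capture_output=True, text=True, check=True)
        result = json.loads(completed.stdout)
        # Stop as soon as any pattern matched any of the returned messages.
        if any(p["matched"] for m in result["messages"] for p in m["patterns"]):
            break
    return result
```

Each iteration sends the question again, so the recipient will see it once per try.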

Sample configuration can be found in tests/askbot-sample-conf.

Example

askbot -m "Do you like me ?" -p yes '(?i)\b(yes|ok)\b' -p no '(?i)\bno\b' -p cancel '(?i)\b(cancel|abort)\b' --max-count 3 -b signal -U '+33123456789' --recipient '+34987654321'

The previous command will :

Send the message "Do you like me ?" to +34987654321 on Signal
Wait for a maximum of 3 messages in answer and return
Or return immediately if a message matches one of the given patterns labeled 'yes', 'no' or 'cancel'

If the user +34987654321 was to reply :

I don't know...
Ok then : NO !

Then the output would be :

{
    "max_responses": false,
    "messages": [{
        "message": "I don't know...",
        "patterns": [{
            "name": "yes",
            "pattern": "(?i)\\b(yes|ok)\\b",
            "matched": false
        }, {
            "name": "no",
            "pattern": "(?i)\\bno\\b",
            "matched": false
        }, {
            "name": "cancel",
            "pattern": "(?i)\\b(cancel|abort)\\b",
            "matched": false
        }]
    }, {
        "message": "Ok then : NO !",
        "patterns": [{
            "name": "yes",
            "pattern": "(?i)\\b(yes|ok)\\b",
            "matched": true
        }, {
            "name": "no",
            "pattern": "(?i)\\bno\\b",
            "matched": true
        }, {
            "name": "cancel",
            "pattern": "(?i)\\b(cancel|abort)\\b",
            "matched": false
        }]
    }]
}

A few notes about the regex usage in this example : in -p yes '(?i)\b(yes|ok)\b' :

(?i) enables case-insensitive match
\b means "edge of a word" ; it is used to make sure the wanted text will not be part of another word (e.g. tik tok would match ok otherwise)
Note that a search is done on the messages (not a match) so it is not required to specify a full regular expression with ^ and $ (though you may do, if you want to). This makes the pattern more readable.
The pattern is labeled 'yes' so it can be easily identified in the JSON output and counted as a positive match

You may also have noticed the importance of defining patterns that don't overlap (here the message matched both 'yes' and 'no') or being ready to handle unknown states.
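The behaviour of these patterns can be checked outside the bot. The sketch below assumes the patterns are evaluated with Python's re.search, as the note on search versus match suggests ; the matched_names helper is hypothetical and only reuses the three patterns from the example.

```python
import re

# The labelled patterns from the example command line.
PATTERNS = [
    ("yes", r"(?i)\b(yes|ok)\b"),
    ("no", r"(?i)\bno\b"),
    ("cancel", r"(?i)\b(cancel|abort)\b"),
]


def matched_names(message):
    """Return the labels of all patterns found anywhere in `message`."""
    return [name for name, regex in PATTERNS if re.search(regex, message)]


print(matched_names("I don't know..."))  # [] : "no" inside "know" is not a whole word
print(matched_names("Ok then : NO !"))   # ['yes', 'no'] : the overlap discussed above
print(matched_names("tik tok"))          # [] : \b keeps "ok" from matching inside "tok"

# Without the word boundaries, "tik tok" would be taken for a positive answer.
print(bool(re.search(r"(?i)(yes|ok)", "tik tok")))  # True
```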

To make use of the bot, you could parse its output with a script, or with a command-line client like jq.

This sample Python snippet will get the name of the matched patterns :

import json

# loads the JSON output (the string below is a placeholder for askbot's actual output)
output = json.loads('{ "max_responses": false, "messages": [...] }')
# 'matched' is the list of the names of the patterns that matched against the last message, e.g. `['yes','no']`
matched = [ p['name'] for p in output['messages'][-1]['patterns'] if p['matched'] ]
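Building on this, a caller may want a single decision that also covers the overlapping and unknown cases. This is a minimal sketch, not something provided by nicobot : the interpret function and the "ambiguous" and "unknown" labels are hypothetical names chosen for illustration.

```python
import json


def interpret(raw_output):
    """Reduce askbot's JSON output to a single decision string.

    Returns the label of the only pattern matched by the last message,
    "ambiguous" if several patterns matched, or "unknown" if none did
    (or if no message was received at all).
    """
    output = json.loads(raw_output)
    if not output["messages"]:
        return "unknown"
    last = output["messages"][-1]
    matched = [p["name"] for p in last["patterns"] if p["matched"]]
    if len(matched) == 1:
        return matched[0]
    return "ambiguous" if matched else "unknown"
```

With the sample output above, the last message matches both 'yes' and 'no', so this helper would report "ambiguous" rather than pick one.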

Generic instructions

Common options

The following options are common to both bots :

--config-file and --config-dir let you change the default configuration directory and file. All configuration files will be looked up from this directory ; --config-file allows overriding the location of config.yml.
--backend selects the chatter system to use : it currently supports "console", "signal" and "jabber" (see below)
--stealth will make the bot connect and listen to messages but print answers to the console instead of sending them ; useful to observe the bot's behavior in a real chatroom...

Configuration file : config.yml

Options can also be taken from a configuration file. By default the bot reads the config.yml file in the current directory, but this can be changed with the --config-file and --config-dir options.

This file is in YAML format with all options at root level. Keys are named after the command line options, with middle dashes - replaced with underscores _ and an s appended for lists. For instance, option --ibmcloud-url https://api... will become ibmcloud_url: https://api... and --keywords-file 1.json --keywords-file 2.json will become :

keywords_files:
    - 1.json
    - 2.json
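The naming rule can be expressed as a small helper. This only illustrates the rule as described here ; config_key is a hypothetical function, not the code nicobot uses, and the caller has to say whether an option is a list.

```python
def config_key(option, is_list=False):
    """Map a command-line option name to its config.yml key.

    Leading dashes are dropped, middle dashes become underscores,
    and an "s" is appended for options that hold a list.
    """
    key = option.lstrip("-").replace("-", "_")
    return key + "s" if is_list else key


print(config_key("--ibmcloud-url"))                 # ibmcloud_url
print(config_key("--keywords-file", is_list=True))  # keywords_files
```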

See also sample configurations in the tests/ directory.

If unsure, please first review YAML syntax as it has a few traps.

Using the Jabber/XMPP backend

By using --backend jabber you can make the bot chat with XMPP (a.k.a. Jabber) users.

Jabber-specific options

--jabber-username and --jabber-password are the JabberID (e.g. myusername@myserver.im) and password of the bot's account used to send and read messages. If --jabber-username is missing, --username will be used.
--jabber-recipient is the JabberID of the person to send the message to. If missing, --recipient will be used.

Example

transbot -C tests/transbot-sample-conf -b jabber -U mybot@myserver.im -r me@myserver.im

With :

-b jabber to select the XMPP/Jabber backend
-U mybot@myserver.im the JabberID of the bot
-r me@myserver.im the JabberID of the correspondent

Using the Signal backend

By using --backend signal you can make the bot chat with Signal users.

Prerequisites

For package or source installations, you must first install and configure signal-cli.

For all installations, you must register or link the computer where the bot will run ; e.g. :

signal-cli link --name MyComputer

With docker images you can do this registration by using the --signal-register option. This will save the registration files into /root/.local/share/signal-cli/ inside the container. If this location links to a persisted volume, it will be reused on each launch.

Please see signal-cli's man page for more details on the registration process.

Signal-specific options

--signal-username selects the account to use to send and read messages : it is a phone number in international format (e.g. +33123456789). In config.yml, make sure to put quotes around it to prevent YAML from treating it as an integer (because of the 'plus' sign). If missing, --username will be used.
--signal-recipient and --signal-group select the recipient (only one of them should be given). Make sure --signal-recipient is in international phone number format and --signal-group is a base64 group ID (e.g. --signal-group "mABCDNVoEFGz0YeZM1234Q=="). If --signal-recipient is missing, --recipient will be used. To get the IDs of the groups you are in, run : signal-cli -u +336123456789 listGroups

Example :

transbot -b signal -U +33612345678 -g "mABCDNVoEFGz0YeZM1234Q==" --ibmcloud-url https://api.eu-de.language-translator.watson.cloud.ibm.com/instances/a234567f-4321-abcd-efgh-1234abcd7890 --ibmcloud-apikey "f5sAznhrKQyvBFFaZbtF60m5tzLbqWhyALQawBg5TjRI"
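Before launching the bot, a wrapper script could check that these two values have the expected shape. This is a rough sketch using only the standard library : both function names are hypothetical, and the digit-count bounds in the phone number check are an assumption rather than a rule taken from nicobot or signal-cli.

```python
import base64
import binascii
import re


def is_international_number(value):
    """True if `value` is a '+' followed by digits only (7 to 15 of them, assumed)."""
    return re.fullmatch(r"\+[1-9][0-9]{6,14}", value) is not None


def is_base64_group_id(value):
    """True if `value` is a non-empty string that decodes as standard base64."""
    if not value:
        return False
    try:
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


print(is_international_number("+33123456789"))         # True
print(is_international_number("33123456789"))          # False : missing '+'
print(is_base64_group_id("mABCDNVoEFGz0YeZM1234Q=="))  # True
print(is_base64_group_id("not base64!"))               # False
```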

Development

Install Python dependencies (both for building and running) with :

pip3 install -r requirements-build.txt -r requirements-runtime.txt

To run unit tests :

python3 -m unittest discover -v -s tests

To run directly from source (without packaging) :

python3 -m nicobot.askbot [options...]

To build locally (more at pypi.org) :

python3 setup.py sdist bdist_wheel

To upload to test.pypi.org :

# Defines username and password (or '__token__' and API key) ; alternatively CLI `-u` and `-p` options or user input may be used (or even certificates, see `python3 -m twine upload --help`)
export TWINE_USERNAME=__token__
export TWINE_PASSWORD=$(pass pypi/test.pypi.org/api_token | head -1)
python3 -m twine upload --repository testpypi dist/*

To upload to PROD pypi.org :

TODO

Otherwise, it is automatically tested, built and uploaded to pypi.org using Travis CI on each push to GitHub.

Docker build

There are several Dockerfiles, each made for specific use cases (see Docker usage above) :

Dockerfile-debian and Dockerfile-debian-slim are quite straightforward and very similar. They still require a multi-stage build to address enough platforms.

Dockerfile-alpine requires a multi-stage build anyway because most of the Python dependencies need to be compiled first. The result however should be a far smaller image than with a Debian base.

Note that the signal-cli backend needs a Java runtime environment, and also rust dependencies to support Signal's group V2. This currently doubles the size of the images and ruins the advantage of alpine over debian...

Those images are limited to CPU architectures :

supported by the base images
for which the Python dependencies are built or able to build
for which the native dependencies of signal (libzkgroup) can be built (alpine only)

Simple build command (single architecture) :

docker build -t nicolabs/nicobot:debian-slim -f Dockerfile-debian-slim .

Sample buildx command (multi-arch) :

docker buildx build --platform linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64,linux/ppc64le,linux/s390x -t nicolabs/nicobot:debian-slim -f Dockerfile-debian-slim .

Then run with the provided sample configuration :

docker run --rm -it -v "$(pwd)/tests:/etc/nicobot" nicolabs/nicobot:debian-slim askbot -c /etc/nicobot/askbot-sample-conf/config.yml

GitHub Actions are currently used (see dockerhub.yml) to automatically build and push the images to Docker Hub, so they are available whenever commits are pushed to the master branch.

The images have all the bots inside, as they only differ by one script from each other. The docker-entrypoint.sh script takes the name of the bot to invoke as its first argument, then its own options and finally the bot's arguments.

Versioning

The command-line option to display the scripts' version relies on setuptools_scm, which extracts it from the underlying git metadata. This is convenient because the developer does not have to manually update the version (or forget to do it prior to a release), however it either requires the version to be fixed inside a package or the .git directory to be present.

There were several options among which the following one has been retained :

Running setup.py creates / updates the version inside the version.py file
The scripts then load this module at runtime

The remaining requirement is that setup.py must be run before the version can be extracted. In exchange :

it does not require setuptools nor git at runtime
it frees us from having the .git directory around at runtime ; this is especially useful to make the docker images smaller
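The two steps can be illustrated with a small self-contained sketch. It is not the project's actual setup.py logic : the helper names, the __version__ variable and the version string are assumptions made for the example, and a real build would obtain the version from setuptools_scm.

```python
import importlib.util
import os
import tempfile


def write_version_file(path, version):
    """Build-time step: store the version string in a plain Python module."""
    with open(path, "w") as f:
        f.write('__version__ = "{}"\n'.format(version))


def load_version(path):
    """Runtime step: import the generated module and read the version back."""
    spec = importlib.util.spec_from_file_location("version", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.__version__


with tempfile.TemporaryDirectory() as tmp:
    version_file = os.path.join(tmp, "version.py")
    write_version_file(version_file, "1.2.3")  # hypothetical version number
    print(load_version(version_file))  # 1.2.3
```

Reading the version this way needs nothing but the generated file, which is the point of the chosen approach.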

Resources

IBM Cloud

Signal

Jabber

Official XMPP libraries : https://xmpp.org/software/libraries.html
OMEMO compatible clients : https://omemo.top/
OMEMO official Python library : looks very immature
Gajim, a Windows/MacOS/Linux XMPP client with OMEMO support : gajim.org | dev.gajim.org/gajim
Conversations, an Android XMPP client with OMEMO support and paid hosting : https://conversations.im

Python libraries :

xmpppy : this library is very easy to use but it does not allow easy access to thread or timestamp, and has no OMEMO support...
github.com/horazont/aioxmpp : officially referenced library from xmpp.org, seems the most complete but misses practical introduction and does not provide OMEMO OOTB.
slixmpp : seems like a cool library too and claims to require minimal dependencies ; plus it supports OMEMO so it's the winner. API doc.