Merge pull request #23 from nicolabs/askbot

Askbot
nicobo 2020-05-20 07:37:01 +02:00 committed by GitHub
commit afd716e00d
GPG key ID: 4AEE18F83AFDEB23
6 changed files with 442 additions and 19 deletions

126
README.md

@ -2,9 +2,9 @@
🤟 A collection of *cool* chat bots 🤟
> Well, there's only one of them as of today... Plus it's absolutely **EXPERIMENTAL** : use it at your own risk !
> My bots are cool, but they are absolutely **EXPERIMENTAL** : use them at your own risk !
It features :
This project features :
- Participating in [Signal](https://www.signal.org/fr/) conversations
- Using [IBM Watson™ Language Translator](https://cloud.ibm.com/apidocs/language-translator) cloud API
@ -16,13 +16,14 @@ Requires :
- Python 3 (>= 3.4.2)
- [signal-cli](https://github.com/AsamK/signal-cli) (for the *Signal* backend)
- An IBM Cloud account ([free account ok](https://www.ibm.com/cloud/free))
- For *transbot* : an IBM Cloud account ([free account ok](https://www.ibm.com/cloud/free))
Install Python dependencies with :
pip3 install -r requirements.txt
See below for Signal requirements.
See below for *Signal* requirements.
## Transbot
@ -38,7 +39,7 @@ The sample configuration shows how to make it translate any message containing "
### Quick start
1. Install prerequistes ; for Debian systems this will look like :
1. Install prerequisites ; for Debian systems this will look like :
```
sudo apt install python3 python3-pip
git clone https://github.com/nicolabs/nicobot.git
@ -54,7 +55,7 @@ The sample configuration shows how to make it translate any message containing "
If you want to send & receive messages through *Signal* instead of reading from the keyboard & printing to the console :
1. Install and configure `signal-cli` (see below for details)
2. Run `python3 nicobot/transbot.py -C test/transbot-sample-conf -b signal -u '+33123456789' -r '+34987654321'` with `-u +33123456789` your *Signal* number and `-r +33987654321` the one of the person you want to make the bot chat with
2. Run `python3 nicobot/transbot.py -C test/transbot-sample-conf -b signal -U '+33123456789' -r '+34987654321'` with `-U +33123456789` your *Signal* number and `-r +34987654321` the number of the person you want the bot to chat with
See below for more options...
@ -63,23 +64,118 @@ See below for more options...
Run `transbot.py -h` to get a description of all options.
Below are the most important configuration options :
Below are the most important configuration options for this bot (please also check the generic options below) :
- **--config-file** and **--config-dir** let you change the default configuration directory and file. All configuration files will be looked up from this directory ; `--config-file` allows overriding the location of `config.yml`.
- **--keyword** and **--keywords-file** will help you generate the list of keywords that will trigger the bot. To do this, run `transbot.py --keyword <a_keyword> --keyword <another_keyword> ...` a **first time** : this will download all known translations for these keywords and save them into a `keywords.json` file. Next time you run the bot, **don't** use the `--keyword` option : it will reuse this saved keywords list. You can use `--keywords-file` to change the default name.
- **--languages-file** : The first time the bot runs, it will download the list of supported languages into `languages.<locale>.json` and reuse it afterwards but you can give it a specific file with the set of languages you want. You can use `--locale` to set the desired locale.
- **--locale** will select the locale to use for default translations (with no target language specified) and as the default parsing language for keywords.
- **--ibmcloud-url** and **--ibmcloud-apikey** can be obtained from your IBM Cloud account ([create a Language Translator instance](https://cloud.ibm.com/apidocs/language-translator) then go to [the resource list](https://cloud.ibm.com/resources?groups=resource-instance))
- **--backend** selects the *chatter* system to use : it currently supports "console" and "signal" (see below)
- **--username** selects the account to use to send and read message ; its format depends on the backend
- **--recipient** and **--group** select the recipient (only one of them should be given) ; its format depends on the backend
- **--stealth** will make the bot connect and listen to messages but print any answer instead of sending it ; useful to observe the bot's behavior in a real chatroom...
The **i18n.\<locale>.yml** file contains localization strings for your locale and fun :
- *Transbot* will say "Hello" when started and "Goodbye" before shutting down : you can configure those banners in this file.
- It also defines the pattern that terminates the bot.
Finally, see the next chapter to learn about the **config.yml** file.
## Askbot
*Askbot* is a one-shot chatbot that sends a question and waits for an answer.
**Again, this is NOT STABLE code, there is absolutely no warranty it will work or not harm butterflies on the other side of the world... Use it at your own risk !**
When run, it will send a message if provided and wait for an answer, in different ways (see options below).
Once the conditions are met to terminate, the bot will print the result in [JSON](https://www.json.org/) format.
The caller will have to parse this JSON structure in order to know what the answer was and which exit condition(s) were met.
### Main configuration options
Run `askbot.py -h` to get a description of all options.
Below are the most important configuration options for this bot (please also check the generic options below) :
- **--max-count <integer>** defines the maximum number of messages to read before exiting. This allows the recipient to send several messages in answer, but currently all of those messages are only returned at once, after they have all been typed, so they cannot be parsed one by one. To give the recipient x tries, run this bot x times instead.
- **--pattern <name> <pattern>** defines a pattern that makes the bot quit when matched. It is a [regular expression pattern](https://docs.python.org/3/howto/regex.html#regex-howto). It can be given several times, hence the `<name>` field, which identifies which pattern(s) matched.
Sample configuration can be found in `test/askbot-sample-conf`.
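As a rough illustration of how named patterns are evaluated, the sketch below mirrors the matching loop of `askbot.py` using only Python's `re` module. The helper name `match_patterns` and the `PATTERNS` list are hypothetical and only serve this example ; they are not part of the project.

```python
import re

def match_patterns(message, patterns):
    """Returns the names of the patterns found anywhere in the message.

    patterns: a list of (name, regex) pairs, like the values given to --pattern
    """
    compiled = [(name, re.compile(regex)) for name, regex in patterns]
    # search() looks for the pattern anywhere in the message,
    # so no ^, $ or .* is needed around the expression
    return [name for name, regex in compiled if regex.search(message)]

# Hypothetical pattern set, similar to the sample configuration
PATTERNS = [
    ("yes", r"(?i)\b(yes|ok)\b"),
    ("no", r"(?i)\bno\b"),
    ("cancel", r"(?i)\b(cancel|abort)\b"),
]

print(match_patterns("Ok then : NO !", PATTERNS))
print(match_patterns("I don't know...", PATTERNS))
```

The first message matches both the 'yes' and the 'no' patterns, while the second one matches none because `\b` prevents `no` from matching inside `know`.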
### Example
The following command will :
- Send the message "Do you like me ?" to +34987654321 on Signal
- Wait for a maximum of 3 messages in answer and return
- Return immediately if one message matches one of the given patterns labeled 'yes', 'no' or 'cancel'
python3 askbot.py -m "Do you like me ?" -p yes '(?i)\b(yes|ok)\b' -p no '(?i)\bno\b' -p cancel '(?i)\b(cancel|abort)\b' --max-count 3 -b signal -U '+33123456789' --recipient '+34987654321'
If the user *+34987654321* replied "I don't know..." then "Ok then : NO !", the output would be :
```json
{
  "max_count": false,
  "messages": [{
    "message": "I don't know...",
    "patterns": [{
      "name": "yes",
      "pattern": "(?i)\\b(yes|ok)\\b",
      "matched": false
    }, {
      "name": "no",
      "pattern": "(?i)\\bno\\b",
      "matched": false
    }, {
      "name": "cancel",
      "pattern": "(?i)\\b(cancel|abort)\\b",
      "matched": false
    }]
  }, {
    "message": "Ok then : NO !",
    "patterns": [{
      "name": "yes",
      "pattern": "(?i)\\b(yes|ok)\\b",
      "matched": true
    }, {
      "name": "no",
      "pattern": "(?i)\\bno\\b",
      "matched": true
    }, {
      "name": "cancel",
      "pattern": "(?i)\\b(cancel|abort)\\b",
      "matched": false
    }]
  }]
}
```
A few notes about the example : in `-p yes '(?i)\b(yes|ok)\b'` :
- `(?i)` enables case-insensitive match
- `\b` means "edge of a word" ; it is used to make sure the wanted text will not be part of another word (e.g. `tik tok` would match `ok` otherwise)
- note that neither `^` nor `$` is used (though they could be) : this keeps the expression simple and avoids having to put `.*` before and after it to allow any other words around the match
- the pattern is labeled 'yes' so it can easily be identified in the JSON output and checked for a positive match
Also notice that it's important either to define patterns that don't overlap (here the last message matches both 'yes' and 'no') or to be ready to handle unknown states.
You could parse the output with a script, or with a command-line client like [jq](https://stedolan.github.io/jq/).
For instance, to get the name of the matched patterns in Python :
```python
import json

# askbot_output is assumed to hold the JSON text printed by askbot.py on standard output
output = json.loads(askbot_output)
matched = [ p['name'] for p in output['messages'][-1]['patterns'] if p['matched'] ]
```
It will return the list of the names of the patterns that matched the last message ; e.g. `['yes','no']` in our above example.
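Since overlapping patterns can produce more than one match, a caller may want to reduce the JSON output to a single decision. The following is only a sketch under assumptions : the function name `interpret` and the labels 'ambiguous' and 'unknown' are made up for this example, and the sample input is a shortened version of the output shown above.

```python
import json

def interpret(askbot_json):
    """Returns the single matched pattern name for the last message,
    or 'ambiguous' if several patterns matched, or 'unknown' if none did."""
    status = json.loads(askbot_json)
    messages = status.get("messages", [])
    if not messages:
        return "unknown"
    matched = [p["name"] for p in messages[-1]["patterns"] if p["matched"]]
    if len(matched) == 1:
        return matched[0]
    return "ambiguous" if matched else "unknown"

# Shortened sample : only the last message of the example above
sample = json.dumps({
    "messages": [{
        "message": "Ok then : NO !",
        "patterns": [
            {"name": "yes", "pattern": "(?i)\\b(yes|ok)\\b", "matched": True},
            {"name": "no", "pattern": "(?i)\\bno\\b", "matched": True},
            {"name": "cancel", "pattern": "(?i)\\b(cancel|abort)\\b", "matched": False},
        ],
    }]
})

print(interpret(sample))
```

With the sample above both 'yes' and 'no' matched, so the caller gets 'ambiguous' and can decide to ask again.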
## Generic instructions
### Main generic options
- **--config-file** and **--config-dir** let you change the default configuration directory and file. All configuration files will be looked up from this directory ; `--config-file` allows overriding the location of `config.yml`.
- **--backend** selects the *chatter* system to use : it currently supports "console" and "signal" (see below)
- **--username** selects the account to use to send and read messages ; its format depends on the backend
- **--recipient** and **--group** select the recipient (only one of them should be given) ; its format depends on the backend
- **--stealth** will make the bot connect and listen to messages but print any answer instead of sending it ; useful to observe the bot's behavior in a real chatroom...
### Config.yml configuration file
@ -117,11 +213,11 @@ Please see the [man page](https://github.com/AsamK/signal-cli/blob/master/man/si
With signal, make sure :
- the `--username` parameter is your phone number in international format (e.g. `+33123456789`). In `config.yml`, make sure to put quotes around it to prevent YAML thinking it's an integer (because of the 'plus' sign)
- specify either `--recipient` as an international phone number or `--group` with a base 64 group ID (e.g. `--group "mABCDNVoEFGz0YeZM1234Q=="`). Once registered with Signal, you can list the IDs of the groups you are in with `signal-cli -u +336123456789 listGroups`
- specify either `--recipient` as an international phone number or `--group` with a base64 group ID (e.g. `--group "mABCDNVoEFGz0YeZM1234Q=="`). Once registered with Signal, you can list the IDs of the groups you are in with `signal-cli -u +336123456789 listGroups`
Sample command line to run the bot with Signal :
python3 nicobot/transbot.py -b signal -u +33612345678 -g "mABCDNVoEFGz0YeZM1234Q==" --ibmcloud-url https://api.eu-de.language-translator.watson.cloud.ibm.com/instances/a234567f-4321-abcd-efgh-1234abcd7890 --ibmcloud-apikey "f5sAznhrKQyvBFFaZbtF60m5tzLbqWhyALQawBg5TjRI"
python3 nicobot/transbot.py -b signal -U +33612345678 -g "mABCDNVoEFGz0YeZM1234Q==" --ibmcloud-url https://api.eu-de.language-translator.watson.cloud.ibm.com/instances/a234567f-4321-abcd-efgh-1234abcd7890 --ibmcloud-apikey "f5sAznhrKQyvBFFaZbtF60m5tzLbqWhyALQawBg5TjRI"

282
nicobot/askbot.py Normal file

@ -0,0 +1,282 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import argparse
import logging
import sys
import os
import shutil
import json
import i18n
import re
import locale
import requests
import random
# Provides an easy way to get the unicode sequence for country flags
import flag
import yaml
import urllib.request
# Own classes
from helpers import *
from bot import Bot
from console import ConsoleChatter
from signalcli import SignalChatter
from stealth import StealthChatter
# Default configuration (some defaults still need to be set up after command line has been parsed)
class Config:
def __init__(self):
self.__dict__.update({
'backend': "console",
'config_file': "config.yml",
'config_dir': os.getcwd(),
'group': None,
'input_file': sys.stdin,
'max_count': -1,
'patterns': [],
'recipient': None,
'signal_cli': shutil.which("signal-cli"),
'signal_stealth': False,
'stealth': False,
'timeout': None,
'username': None,
'verbosity': "INFO",
})
class Status:
def __init__(self):
self.__dict__.update({
'max_count': False,
'messages': [],
})
class AskBot(Bot):
"""
Sends a message and reads the answer.
Can be configured with retries, pattern matching, ...
patterns : a list of 2-element list/tuple as [name,pattern]
"""
def __init__( self, chatter, message, output=sys.stdout, err=sys.stderr, patterns=[], max_count=-1 ):
# TODO Implement a global session timeout after which the bot exits
self.status = Status()
self.responses_count = 0
self.chatter = chatter
self.message = message
self.output = output
self.err = err
self.max_count = max_count
self.patterns = []
for pattern in patterns:
self.patterns.append({ 'name':pattern[0], 'pattern':re.compile(pattern[1]) })
def onMessage( self, message ):
"""
Called by self.chatter whenever a message has arrived.
message: A plain text message
Returns the full status with exit conditions
"""
status_message = { 'message':message, 'patterns':[] }
self.status.messages.append(status_message)
self.responses_count = self.responses_count + 1
logging.info("<<< %s", message)
# If we reached the last message or if we exceeded it (possible if we received several answers in a batch)
if self.max_count>0 and self.responses_count >= self.max_count:
logging.debug("Max amount of messages reached")
self.status.max_count = True
# Another way to quit : pattern matching
for p in self.patterns:
name = p['name']
pattern = p['pattern']
status_pattern = { 'name':name, 'pattern':pattern.pattern, 'matched':False }
status_message['patterns'].append(status_pattern)
if pattern.search(message):
logging.debug("Pattern '%s' matched",name)
status_pattern['matched'] = True
matched = [ p for p in status_message['patterns'] if p['matched'] ]
# Check if any exit condition is met to notify the underlying chatter engine
if self.status.max_count or len(matched) > 0:
logging.debug("Exit condition met (pattern matched or max count reached) : exiting...")
self.chatter.stop()
def run( self ):
"""
Starts the bot :
1. Sends the given message(s)
2. Reads at most 'max_count' answers (or stops earlier if a pattern matches)
Returns the execution status of this bot
"""
logging.debug("Bot ready.")
self.registerExitHandler()
if self.message:
self.chatter.send(self.message)
# Blocks on this line until the bot exits
logging.debug("Bot reading answer...")
self.chatter.start(self)
logging.debug("Bot done.")
return self.status
if __name__ == '__main__':
"""
A convenient CLI to play with this bot.
Arguments are compatible with https://github.com/xmpppy/xmpppy/blob/master/xmpp/cli.py and `$HOME/.xtalk`
but new ones are added.
TODO Put generic arguments in bot.py and inherit from it (should probably provide a parent ArgumentParser)
"""
#
# Two-pass arguments parsing
#
# config is the final, merged configuration
config = Config()
parser = argparse.ArgumentParser( description='Sends a XMPP message and reads the answer', formatter_class=argparse.ArgumentDefaultsHelpFormatter )
# Bootstrap options
parser.add_argument("--config-file", "-c", "--config", dest="config_file", default=config.config_file, help="YAML configuration file.")
parser.add_argument("--config-dir", "-C", dest="config_dir", default=config.config_dir, help="Directory where to find configuration files by default.")
parser.add_argument('--verbosity', '-V', dest='verbosity', default=config.verbosity, help="Log level")
# Chatter-generic arguments
parser.add_argument("--backend", "-b", dest="backend", choices=["signal","console"], default=config.backend, help="Chat backend to use")
parser.add_argument("--input-file", "-i", dest="input_file", default=config.input_file, help="File to read messages from (one per line)")
parser.add_argument('--username', '-U', '--jabberid', dest='username', help="Sender's ID (a phone number for Signal, a Jabber Identifier (JID) aka. username for Jabber/XMPP)")
parser.add_argument('--recipient', '-r', '--receiver', dest='recipient', action='append', help="Recipient's ID (e.g. '+12345678901' for Signal / JabberID (Receiver address) to send the message to)")
parser.add_argument('--group', '-g', dest='group', help="Group's ID (for Signal : a base64 string, e.g. 'mPC9JNVoKDGz0YeZMsbL1Q==')")
parser.add_argument('--stealth', dest='stealth', action="store_true", default=config.stealth, help="Activate stealth mode on any chosen chatter")
# Other core options
parser.add_argument('--password', '-P', dest='password', help="Sender's password")
parser.add_argument('--max-count', dest='max_count', type=int, default=config.max_count, help="Read this maximum number of responses before exiting")
parser.add_argument('--message', '-m', dest='message', help="Message to send. If missing, will read from --input-file")
parser.add_argument('--message-file', '-f', dest='message_file', type=argparse.FileType('r'), default=sys.stdin, help="File with the message to send. If missing, will be read from standard input")
parser.add_argument('--pattern', '-p', dest='patterns', action='append', nargs=2, help="A name followed by a regular expression ; the bot stops reading as soon as a message matches the expression. Can be given several times.")
parser.add_argument('--timeout', '-t', dest='timeout', type=int, default=config.timeout, help="How much time to wait for an answer before quitting (in seconds)")
# Misc. options
parser.add_argument("--debug", "-d", action="store_true", dest='debug', default=False, help="Activate debug logs (overrides --verbosity)")
# Signal-specific arguments
parser.add_argument('--signal-cli', dest='signal_cli', default=config.signal_cli, help="Path to `signal-cli` if not in PATH")
parser.add_argument('--signal-stealth', dest='signal_stealth', action="store_true", default=config.signal_stealth, help="Activate Signal chatter's specific stealth mode")
# Jabber-specific arguments
# TODO
#
# 1st pass only matters for 'bootstrap' options : configuration file and logging
#
# Note : we don't let the parse_args method merge the 'args' into config yet,
# because it would not be possible to make the difference between the default values
# and the ones explicitly given by the user
# This is useful for instance to throw an exception if a file given by the user doesn't exist, which is different from the default filename
# 'config' is therefore the defaults overridden by user options while 'args' has only user options
args = parser.parse_args()
# Logging configuration
configure_logging(args.verbosity,debug=args.debug)
logging.debug( "Configuration for bootstrap : %s", repr(vars(args)) )
# Fills the config with user-defined default options from a config file
try:
# Allows config_file to be relative to the config_dir
config.config_file = filter_files(
[args.config_file,
os.path.join(args.config_dir,"config.yml")],
should_exist=True,
fallback_to=1 )[0]
logging.debug("Using config file %s",config.config_file)
with open(config.config_file,'r') as file:
# The FullLoader parameter handles the conversion from YAML
# scalar values to the Python dictionary format
try:
# This is the required syntax in newer pyyaml distributions
dictConfig = yaml.load(file, Loader=yaml.FullLoader)
except AttributeError:
# Some systems (e.g. raspbian) ship with an older version of pyyaml
dictConfig = yaml.load(file)
logging.debug("Successfully loaded configuration from %s : %s" % (config.config_file,repr(dictConfig)))
config.__dict__.update(dictConfig)
except OSError as e:
# If it was a user-set option, stop here
if args.config_file == config.config_file:
raise e
else:
logging.debug("Could not open %s ; no config file will be used",config.config_file)
logging.debug(e, exc_info=True)
pass
# From here the config object has only the default values for all configuration options
#
# 2nd pass parses all options
#
# Updates again the existing config object with all parsed options
config = parser.parse_args(namespace=config)
# From the bootstrap parameters, only logging level may need to be read again
configure_logging(config.verbosity,debug=config.debug)
logging.debug( "Final configuration : %s", repr(vars(config)) )
#
# From here the config object has default options from:
# 1. hard-coded default values
# 2. configuration file overrides
# 3. command line overrides
#
# We can now check the required options that could not be checked before
# (because required arguments may have been set from the config file and not on the command line)
#
# Creates the chat engine depending on the 'backend' parameter
if config.backend == "signal":
if not config.signal_cli:
raise ValueError("Could not find the 'signal-cli' command in PATH and no --signal-cli given")
if not config.username:
raise ValueError("Missing a username")
if not config.recipient and not config.group:
raise ValueError("Either --recipient or --group must be provided")
chatter = SignalChatter(
username=config.username,
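# --recipient may be a list (command line, action='append'), a string (config file) or None (when --group is used)
recipient=config.recipient[0] if isinstance(config.recipient, list) else config.recipient,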
group=config.group,
signal_cli=config.signal_cli,
stealth=config.signal_stealth
)
# TODO :timeout=config.timeout
# By default (or if backend == "console"), will read from stdin or a given file and output to console
else:
chatter = ConsoleChatter(config.input_file,sys.stdout)
if config.stealth:
chatter = StealthChatter(chatter)
#
# Real start
#
bot = AskBot(
chatter=chatter,
message=config.message,
patterns=config.patterns,
max_count=config.max_count
)
status = bot.run()
print( json.dumps(vars(status)), file=sys.stdout, flush=True )
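The layering described in the comments above (hard-coded defaults, then configuration file, then command line) relies on `argparse` applying a parser default only when the attribute is not already set on the namespace it receives. The sketch below shows that behavior in isolation ; it is a reduced illustration, not the project's code, and `merge_config` plus the two-option `Config` class are hypothetical names.

```python
import argparse

class Config:
    """Hard-coded defaults (a reduced, illustrative subset)."""
    def __init__(self):
        self.backend = "console"
        self.max_count = -1

def merge_config(file_values, argv):
    config = Config()
    # 1. Values loaded from the configuration file override hard-coded defaults
    config.__dict__.update(file_values)
    parser = argparse.ArgumentParser()
    parser.add_argument("--backend", default="console")
    parser.add_argument("--max-count", dest="max_count", type=int, default=-1)
    # 2. Options present in argv override the attributes already set on config ;
    #    parser defaults are only applied to attributes missing from the namespace
    return parser.parse_args(argv, namespace=config)

merged = merge_config({"max_count": 3}, ["--backend", "signal"])
print(merged.backend, merged.max_count)
```

Here `max_count` keeps the value coming from the (simulated) configuration file because it was not given on the command line, while `backend` takes the command-line value.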


@ -12,13 +12,22 @@ class ConsoleChatter:
def __init__( self, input=sys.stdin, output=sys.stdout ):
self.input = input
self.output = output
self.exit = False
def start( self, bot ):
# TODO Do it asynchronous (rather than testing self.exit between each instruction)
if self.exit:
return
for line in self.input:
bot.onMessage( line )
if self.exit:
return
logging.debug( "<<< %s", line )
bot.onMessage( line.rstrip() )
if self.exit:
return
def send( self, message ):
print( message, file=self.output, flush=True )
logging.debug( ">>> %s", message )
def stop( self ):
sys.exit(0)
self.exit = True
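Replacing `sys.exit(0)` with a flag lets `start()` return normally, so the caller can still do something after the conversation ends (Askbot prints its JSON status at that point). The sketch below illustrates this cooperative stop with stand-in classes : `FlagChatter` and `FirstAnswerBot` are hypothetical and much simpler than the real `ConsoleChatter` and `AskBot`.

```python
import io

class FlagChatter:
    """Hypothetical stand-in for ConsoleChatter : reads lines until asked to stop."""
    def __init__(self, input):
        self.input = input
        self.exit = False

    def start(self, bot):
        for line in self.input:
            bot.onMessage(line.rstrip())
            # The flag is checked between messages, so the loop ends cleanly
            if self.exit:
                return

    def stop(self):
        # Only records the request ; start() returns at its next check
        self.exit = True

class FirstAnswerBot:
    """Hypothetical bot that stops the chatter after the first message."""
    def __init__(self, chatter):
        self.chatter = chatter
        self.received = []

    def onMessage(self, message):
        self.received.append(message)
        self.chatter.stop()

chatter = FlagChatter(io.StringIO("first\nsecond\n"))
bot = FirstAnswerBot(chatter)
chatter.start(bot)
# Execution continues here, which would not happen if stop() called sys.exit(0)
print(bot.received)
```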


@ -4,6 +4,7 @@
Helper functions
"""
import sys
import logging
@ -12,6 +13,27 @@ TRACE = 5
logging.addLevelName(TRACE,'TRACE')
def configure_logging( level=None, debug=None ):
"""
Sets default logging preferences for this module
if debug=True, overrides level with DEBUG
"""
if debug:
logLevel = logging.DEBUG
else:
try:
# Before Python 3.4 and back since 3.4.2 we can simply pass a level name rather than a numeric value (Yes !)
# Otherwise manually parsing textual log levels was not clean IMHO anyway : https://docs.python.org/2/howto/logging.html#logging-to-a-file
logLevel = logging.getLevelName(level.upper())
except ValueError:
raise ValueError('Invalid log level: %s' % level)
# Logs are output to stderr ; stdout is reserved to print the answer(s)
logging.basicConfig(level=logLevel, stream=sys.stderr, format='%(asctime)s\t%(levelname)s\t%(message)s')
def filter_files( files, should_exist=False, fallback_to=None ):
"""
files: a list of filenames / open files to filter
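One detail about resolving textual log levels may be worth illustrating : `logging.getLevelName()` does not raise for an unknown name but returns a string such as `'Level FOO'`, so an invalid level can only be detected by checking the type of the result (or later, when the level is actually applied). The sketch below is one possible way to validate the name up front ; `resolve_level` is a hypothetical helper, not a function of this project.

```python
import logging

def resolve_level(name):
    """Returns the numeric level for a name like 'info', or raises ValueError."""
    level = logging.getLevelName(str(name).upper())
    # For an unknown name, getLevelName() returns a string instead of a number
    if not isinstance(level, int):
        raise ValueError('Invalid log level: %s' % name)
    return level

print(resolve_level("info") == logging.INFO)
```

Levels registered with `logging.addLevelName()` (like the `TRACE` level defined in this module) should resolve the same way, since they are looked up through the same name table.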


@ -554,7 +554,7 @@ if __name__ == '__main__':
# Chatter-generic arguments
parser.add_argument("--backend", "-b", dest="backend", choices=["signal","console"], default=config.backend, help="Chat backend to use")
parser.add_argument("--input-file", "-i", dest="input_file", default=config.input_file, help="File to read messages from (one per line)")
parser.add_argument('--username', '-u', dest='username', help="Sender's number (e.g. +12345678901 for the 'signal' backend)")
parser.add_argument('--username', '-U', dest='username', help="Sender's number (e.g. +12345678901 for the 'signal' backend)")
parser.add_argument('--group', '-g', dest='group', help="Group's ID in base64 (e.g. 'mPC9JNVoKDGz0YeZMsbL1Q==' for the 'signal' backend)")
parser.add_argument('--recipient', '-r', dest='recipient', help="Recipient's number (e.g. +12345678901)")
parser.add_argument('--stealth', dest='stealth', action="store_true", default=config.stealth, help="Activate stealth mode on any chosen chatter")


@ -0,0 +1,14 @@
# Each entry is a couple of [ name, pattern ]
patterns:
- [ "yes", "(?i)\\b(yes|ok)\\b" ]
- [ "no", "(?i)\\bno\\b" ]
- [ "cancel", "(?i)\\b(cancel|abort)\\b" ]
backend: console
#backend: signal
# Make sure to put quotes around the username field as it is a phone number for Signal
username: "+33123456789"
recipient: "+33123456789"
# Get this group ID with the command `signal-cli -u +33123456789 listGroups`
#group: "mABCDNVoEFGz0YeZM1234Q=="