Merge pull request #1 from guiraldelli/appscipt
From Python to Google Apps Script
commit
e38c924f74
20
LICENSE.txt
Normal file
@@ -0,0 +1,20 @@
The MIT License (MIT)

Copyright (c) 2015 Ricardo H. Gracini Guiraldelli

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
5
README
@@ -1,5 +0,0 @@
cfp-bot is an app that processes "Call For Papers" e-mails and automatically adds the submission deadline dates to Google Calendar.

This app uses some third-party libraries, such as:
- parsedatetime [ http://code.google.com/p/parsedatetime/ ]
- Google Data APIs [ http://code.google.com/p/gdata-python-client/downloads/list ]
52
README.md
Normal file
@@ -0,0 +1,52 @@
# "Call For Papers" Bot
This *bot* was developed to automatically process *"call for papers"*
(CFP) emails received in academic circles.

It was developed with the needs of the
[Adaptive Technologies Lab](http://lta.poli.usp.br/) in mind by one of its
(former) students (me), not as a final product for widespread use. However,
anyone is free to try it at their own risk and, even better, to collaborate on
improvements to the existing code.

## License
So far, it is still licensed under the
[MIT License](https://www.tldrlegal.com/l/mit). Maybe, someday, it will be
even more open.

## Acknowledgments
We use the [Datejs](http://www.datejs.com/) library to parse the
many ways people write dates. The library is licensed under
the [MIT License](https://www.tldrlegal.com/l/mit).

## The Old Version
This repository was originally the home of the old version, written in Python.
If you are familiar with GitHub and git, feel free to browse it and learn from it.
Nonetheless...

## The New Version
The current (new) version of this bot is a port of the old code from Python to
JavaScript or, more precisely,
[Google Apps Script](https://developers.google.com/apps-script/).
Why? Very simple: the bot relies entirely on Google products, so we
can use Google Apps Script to trigger it automatically, handing
Google the responsibility of maintaining our `cron` jobs. It is as simple as that.
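
As a rough illustration of handing the scheduling over to Google, the sketch below installs a time-driven trigger for the bot's `main` function. The helper name `install_hourly_trigger`, its `script_app` parameter, and the one-hour interval are assumptions made for this example and are not part of the repository; in Apps Script the argument would be the built-in `ScriptApp` service.

```javascript
// Hypothetical helper (not in this repository): installs a time-driven
// trigger that runs the function named "main" once every hour.
// `script_app` is expected to be the Apps Script `ScriptApp` service; it is
// taken as a parameter so the helper can also be exercised with a stand-in.
function install_hourly_trigger(script_app){
  return script_app.newTrigger("main") // name of the function to run
    .timeBased()                       // time-driven (clock) trigger
    .everyHours(1)                     // assumed interval: every hour
    .create();
}
```

Inside the Apps Script editor this would be run once, as `install_hourly_trigger(ScriptApp)`; after that the platform calls `main` on schedule.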

### It is **not** perfect
I know it is not a perfect bot, but it does an amazing job in a very simple way.

Could I have used machine learning? Yes.

Could I have used neural networks? Yes.

Could I have used Bayesian networks? Yes.

Could I have used [mechanical turks](https://en.wikipedia.org/wiki/The_Turk)?
Yes, I could.

But, believe me, it was a one-night (or two-night) project to solve a problem
that was driving us crazy: an inbox full of CFP emails.

I am very interested in using more intelligent approaches to solve this problem,
but the probability that it will happen tends to **zero**.

Anyhow, your collaboration, fork or whatever is welcome! :smile:
5
calendar_processor.js
Normal file
@@ -0,0 +1,5 @@
// connects to Google Calendar and creates an event in the default calendar
// of the account
function create_event(title, date){
  return CalendarApp.getDefaultCalendar().createAllDayEvent(title, date);
}
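`createAllDayEvent` expects a real `Date`, while a date parser may hand back `null` or an invalid `Date` when it cannot understand its input. A minimal sketch of a guard that could run before `create_event` follows; the name `is_valid_date` is hypothetical and does not appear in this commit.

```javascript
// Hypothetical guard (not part of this commit): true only for Date objects
// that hold a real point in time.
function is_valid_date(value){
  // an invalid Date reports NaN from getTime()
  return value instanceof Date && !isNaN(value.getTime());
}
```

A caller could skip the calendar step, and flag the message instead, whenever `is_valid_date(date)` is false.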
133
cfp_bot.py
@@ -1,133 +0,0 @@
#!/usr/bin/env python

# Copyright (c) 2011, R. H. Gracini Guiraldelli <rguira@acm.org>
# All rights reserved.

# Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

# Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
# Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
# Neither the name of the R. H. Gracini Guiraldelli nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

import sys
import re
import parsedatetime.parsedatetime as pdt
import parsedatetime.parsedatetime_consts as pdc
import gdata.calendar
import gdata.calendar.service
import gdata.service
import atom
import time
import imaplib
import array

def load_file(filename):
    fp = None
    try:
        fp = open(filename, "r")
    except:
        sys.stderr.write("File not found!\n")
    return fp

def find_key_line(line):
    pattern = r"((call\s+for\s+(paper|papers))|submission|deadline)"
    regex = re.compile(pattern, re.IGNORECASE | re.UNICODE)
    found = regex.search(line)
    return found

def find_date(line):
    # FIXME: Let global these regex components
    # FIXME: define a config file where these regex could be
    pattern = r"((\d{1,2}/\d{1,2}/(\d{4}|\d{2}))|(\d{4}-\d{2}-\d{2})|(\d{1,2}(st|nd|rd|th)*\s*(Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December)\s*(\d{4}|\d{2}))|((Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December)(,)?\s*\d{1,2}(st|nd|rd|th)*\s*(\d{4}|\d{2}))|((Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December)\s*\d{1,2}(st|nd|rd|th)*\s*(,)?\s*(\d{4}|\d{2})))"
    regex = re.compile(pattern, re.IGNORECASE | re.UNICODE)
    found = regex.search(line)
    return found

def connect_google_calendar(username, password):
    calendar_service = gdata.calendar.service.CalendarService()
    calendar_service.email = username
    calendar_service.password = password
    calendar_service.source = "Call For Papers App - Beta Version"
    calendar_service.ProgrammaticLogin()
    return calendar_service

def create_calendar_event(calendar_service, title, date):
    event = gdata.calendar.CalendarEventEntry()
    event.title = atom.Title(text=title)
    content = "Deadline for submission @ " + title
    event.content = atom.Content(text=content)
    start_time = time.strftime('%Y-%m-%d', date)
    event.when.append(gdata.calendar.When(start_time = start_time, end_time = start_time))
    new_event = calendar_service.InsertEvent(event, '/calendar/feeds/default/private/full')
    return new_event

def connect_imap_server(username, password):
    imap_connection = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    imap_connection.login(username,password)
    return imap_connection

def disconnect_imap_server(imap_connection):
    imap_connection.close()
    imap_connection.logout()

def processing_emails(imap_connection):
    print ">> Processing e-mails..."
    status, count = imap_connection.select('[Gmail]/All Mail')
    print "\tYou have %s in the '[Gmail]/All Mail' folder." % (count)
    status, found = imap_connection.search(None, '(UNSEEN)')
    print "\tAnd you have %d unseen e-mail in that folder." % (len(found[0].strip()))
    regex = re.compile('(?<=(Subject:\s))(.*)')
    i = 0
    mails = []
    for mail_number in found[0].strip():
        try:
            status, data = imap_connection.fetch(mail_number, '(BODY[HEADER])')
            mail_header = regex.search(data[0][1])
            single_mail = []
            single_mail.append(mail_header.group(0).strip())
            status, data = imap_connection.fetch(mail_number, '(BODY[TEXT])')
            single_mail.append(data[0][1])
            if ( (single_mail[0] == '') or (single_mail[0] == None) or (single_mail[1] == '') or (single_mail[1] == None) ):
                print ">> WARNING: Message could not be processed. Flaggind it! <<"
                imap_connection.store(mail_number, '+FLAGS', '\\Flagged')
            mails.append(single_mail)
        except:
            print ">> WARNING: Could not fetch message of number %s <<" % (mail_number)
    return mails

def process_dates(mails):
    # TODO: I must improve it: what if it does not parse the date? The array index will go wrong!
    dates = []
    c = pdc.Constants()
    date_parser = pdt.Calendar(c)
    for single_mail in mails:
        parsed_date = None
        matched_line = find_key_line(single_mail[1])
        matched_date = find_date(single_mail[1])
        if ( (matched_line != None) and (matched_date != None) ):
            parsed_date = date_parser.parseDateText(matched_date.group(0))
        else:
            pass
        dates.append(parsed_date)
    return dates

def main():
    username = raw_input("E-mail address: ")
    password = raw_input("Password: ")
    imap_connection = connect_imap_server(username, password)
    mails = processing_emails(imap_connection)
    disconnect_imap_server(imap_connection)
    dates = process_dates(mails)
    calendar_service = connect_google_calendar(username, password)
    for i in range(0,len(dates)):
        single_mail = mails[i]
        date = dates[i]
        print "\t Subject: '%s' and Date: '%s'" % (single_mail[0], date)
        try:
            new_event = create_calendar_event(calendar_service, single_mail[0], date)
        except:
            print ">> ERROR: Event could not be added to Google Calendar! <<"

if __name__ == "__main__":
    main()
33
date_processor.js
Normal file
@@ -0,0 +1,33 @@
// regex pattern for finding the dates
var REGEX_DATE = /((\d{1,2}\/\d{1,2}\/(\d{4}|\d{2}))|(\d{4}-\d{2}-\d{2})|(\d{1,2}(st|nd|rd|th)*\s*(Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December)\s*(\d{4}|\d{2}))|((Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December)(,)?\s*\d{1,2}(st|nd|rd|th)*\s*(\d{4}|\d{2}))|((Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December)\s*\d{1,2}(st|nd|rd|th)*\s*(,)?\s*(\d{4}|\d{2})))/i;
// ISO format for dates
var DATE_ISO_FORMAT = "yyyy-MM-dd";

// given a date represented as a string, returns the date as a JavaScript Date
// object using the Datejs library
function get_date(string_date){
  return Date.parse(string_date);
}

// gets a matched date from the regex and converts it to a string in the ISO format
// using the Datejs library
function get_iso_date(matched_date){
  return Date.parse(matched_date).toString(DATE_ISO_FORMAT);
}

// returns the matched date found in the line, or null when match() finds nothing
function get_literal_date(line){
  var match = line.match(REGEX_DATE);
  if (match !== null){
    return match[0];
  }
  else{
    return null;
  }
}

// verifies if a line contains a date,
// returning true if it does
function has_date(line){
  return REGEX_DATE.test(line);
}
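`String.prototype.match` returns `null`, not an empty array, when nothing matches, so a helper such as `get_literal_date` has to handle that case before touching the result. The sketch below shows the pattern with a deliberately simplified, hypothetical regex that covers ISO dates only; `ISO_DATE_SKETCH` and `find_iso_date` are names invented for this example.

```javascript
// Simplified stand-in for REGEX_DATE (assumption: ISO "yyyy-MM-dd" only).
var ISO_DATE_SKETCH = /\d{4}-\d{2}-\d{2}/;

// Returns the first ISO-looking date in the line, or null if there is none.
function find_iso_date(line){
  var match = line.match(ISO_DATE_SKETCH);
  // match is null when the pattern is absent, so check before indexing
  return match === null ? null : match[0];
}
```

For instance, `find_iso_date("Submission deadline: 2015-03-31 (firm)")` yields the string `"2015-03-31"`, while a line without a date yields `null` instead of throwing.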
39
email_connector.js
Normal file
@@ -0,0 +1,39 @@
var UNREAD_THREADS_QUERY = "is:unread";

// gets an array of unread threads from Gmail
function get_unread_threads(){
  return GmailApp.search(UNREAD_THREADS_QUERY);
}

// given an array of unread Gmail threads, composes an array of unread messages
function get_unread_messages(unread_threads){
  return unread_threads.reduce(reduce_unread_messages, []);
}

// reduce function in which all unread messages are merged into a single array
function reduce_unread_messages(previousValue, currentValue){
  // var messages = currentValue.getMessages();
  // var unread = messages.filter(is_message_unread);
  // return previousValue.concat(unread);
  return previousValue.concat(currentValue.getMessages().filter(is_message_unread));
}

// filter function which says whether a Gmail message is unread or not
function is_message_unread(gmail_message){
  return gmail_message.isUnread();
}

// returns the plain body text of a Gmail message
function get_message_text(gmail_message){
  return gmail_message.getPlainBody();
}

// returns the subject text of a Gmail message
function get_subject_text(gmail_message){
  return gmail_message.getSubject();
}

// main function, which returns the plain text body of all unread Gmail messages
function get_body_all_unread_messages(){
  return get_unread_messages(get_unread_threads()).map(get_message_text);
}
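A thread returned by an `is:unread` search can still contain messages that were already read, which is why the reduce step filters each thread before concatenating. The sketch below reproduces that flattening outside Apps Script; `fake_message`, `fake_thread`, and `collect_unread` are stand-ins invented for this example and only mimic the `getMessages`, `isUnread`, and `getSubject` methods used above.

```javascript
// Stand-ins for GmailMessage and GmailThread, for illustration only.
function fake_message(subject, unread){
  return {
    getSubject: function(){ return subject; },
    isUnread: function(){ return unread; }
  };
}
function fake_thread(messages){
  return { getMessages: function(){ return messages; } };
}

// Same shape as reduce_unread_messages: append each thread's unread messages.
function collect_unread(accumulator, thread){
  return accumulator.concat(thread.getMessages().filter(function(message){
    return message.isUnread();
  }));
}

var threads = [
  fake_thread([fake_message("CFP A", false), fake_message("Re: CFP A", true)]),
  fake_thread([fake_message("CFP B", true)])
];

// Flatten the threads into one array, then keep only the subjects.
var unread_subjects = threads.reduce(collect_unread, []).map(function(message){
  return message.getSubject();
});
// unread_subjects holds "Re: CFP A" and "CFP B"; the read "CFP A" is dropped
```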
42
email_processor.js
Normal file
@@ -0,0 +1,42 @@
// regex pattern for finding the "key line" that classifies the email as "call for papers"
var REGEX_KEY_LINE = /((call\s+for\s+(paper|papers))|submission|deadline)/i; //ignore case
// regex for the line which contains paper submission deadline information
var REGEX_PAPER_DEADLINE = /(paper(s)?)*(submission|deadline)(paper(s)?)*/i;
// regex pattern of forward text in subject of emails
var REGEX_FORWARD = /^\s*(fw(d)?|en(c)?):\s*/i;


// verifies if a line contains the information of a call for papers email,
// returning true if it does
function has_key_line(line){
  return REGEX_KEY_LINE.test(line);
}

// verifies if the line contains the keywords for the paper submission deadline date,
// returning true if it does
function is_paper_deadline(line){
  return REGEX_PAPER_DEADLINE.test(line);
}

// removes forward text in subject
function remove_forward(subject_text){
  return subject_text.replace(REGEX_FORWARD, EMPTY_STRING);
}

// takes a GmailMessage object and processes it, extracting the submission deadline and creating a calendar event
function process_email(gmail_message){
  var subject = remove_forward(get_subject_text(gmail_message));
  var lines_of_interest = break_lines(get_message_text(gmail_message)).filter(has_date).filter(is_paper_deadline);
  // process only one entry of lines of interest
  if (lines_of_interest.length > 0){
    var date = get_date(get_literal_date(lines_of_interest[0]));
    var calendar_event = create_event(subject, date);
    if (calendar_event == null){
      Logger.log("It was not possible to create an event with the following details:\n\tSubject: %s\n\tDate: %s", subject, date);
      gmail_message.star();
    }
    else{
      gmail_message.markRead();
    }
  }
}
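`process_email` and `remove_forward` rely on `break_lines` and `EMPTY_STRING`, neither of which is defined in the files shown in this diff. The sketch below gives one plausible definition of each, purely as an assumption, together with a copy of `REGEX_FORWARD` and `remove_forward` so the block runs on its own.

```javascript
// Assumed definitions: neither EMPTY_STRING nor break_lines appears in this diff.
var EMPTY_STRING = "";

// Splits a message body into lines, accepting both \r\n and \n line endings.
function break_lines(text){
  return text.split(/\r?\n/);
}

// Copied from email_processor.js so the example is self-contained.
var REGEX_FORWARD = /^\s*(fw(d)?|en(c)?):\s*/i;

// Strips a single leading forward marker ("Fw:", "Fwd:", "En:", "Enc:").
function remove_forward(subject_text){
  return subject_text.replace(REGEX_FORWARD, EMPTY_STRING);
}
```

Because the pattern is anchored at the start and not repeated, a subject such as `"Fw: Fwd: Workshop"` keeps its second marker and becomes `"Fwd: Workshop"`.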
6
main.js
Normal file
@@ -0,0 +1,6 @@
// main function, which gets all unread Gmail messages and processes them all
function main(){
  Logger.log("Initiating the process of the call for papers emails...");
  get_unread_messages(get_unread_threads()).forEach(process_email);
  Logger.log("Execution is over.");
}