When a new release is published, the ISO and torrent file might not
have been rsync'd to the archweb server yet, so torrent clients that
watch the feed fail to retrieve the torrent file and give up. To avoid
this, use the torrent data stored on the Release model, which is always
available.
This change introduces a replacement for planet.archlinux.org, which
uses a Python 2 project to generate static HTML from multiple RSS feed
sources. For archweb, a set of 'static' feeds, such as the Arch forums,
can be created in the Django admin view. archweb users can also add
their own blog RSS feed in their profile, which creates a Feed model.
When running the update_planet command, all Feed models are iterated
over and each RSS feed is parsed. The latest FeedItem matching the
current Feed model is queried, and every newer entry in the RSS feed is
added as a new FeedItem. Since the body is also stored in the FeedItem,
the number of FeedItems per Feed is limited by a setting in settings.py,
which defaults to 25.
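The update step described above can be sketched without Django roughly
as follows. This is only an illustration: the Feed and FeedItem
dataclasses, the update_feed() helper and the FEED_ITEM_LIMIT name are
hypothetical stand-ins, not the actual archweb models or setting name.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Hypothetical stand-in for the per-Feed limit configured in settings.py.
FEED_ITEM_LIMIT = 25


@dataclass
class FeedItem:
    title: str
    published: datetime
    body: str = ""


@dataclass
class Feed:
    url: str
    items: list = field(default_factory=list)


def update_feed(feed, entries, limit=FEED_ITEM_LIMIT):
    """Add entries newer than the latest stored item, then trim to `limit`."""
    latest = max((item.published for item in feed.items), default=None)
    added = 0
    for entry in entries:
        if latest is None or entry.published > latest:
            feed.items.append(entry)
            added += 1
    # Keep only the newest `limit` items, since the bodies are stored too.
    feed.items.sort(key=lambda item: item.published, reverse=True)
    del feed.items[limit:]
    return added
```

Running update_feed() a second time with the same entries adds nothing,
because every entry is no newer than the latest stored item.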
When a user is marked as inactive, their Feed model and its items are
removed automatically to avoid keeping stale data around.
Closes: #261
We have Last-Modified here, and from what I can tell with some more
reading and playing with caching, it isn't necessarily wise to set both
of them in the same response. Set the one that we actually trust.
Signed-off-by: Dan McGee <dan@archlinux.org>
We had this elaborate system set up with caching and invalidation, which
is overkill since we cache the result of the view anyway. Just hit the
database when needed to find the last change to the respective model
class and be done with it.
Signed-off-by: Dan McGee <dan@archlinux.org>
The XML generation underlying our package feeds was doing 1600+ calls to
the write() method on the outfile. For some reason, the Python standard
library insists on calling flush() after every write, which really makes
performance take a nosedive. Wrap the write calls and do them in batches
to remove some of the overhead and make feed generation a bit snappier.
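A minimal sketch of the batching idea, assuming a simple wrapper object
around the output file. The BatchedWriter name and the batch size of 100
are made up for illustration and are not taken from the actual change.

```python
class BatchedWriter:
    """Collect small write() calls and pass them on in larger chunks."""

    def __init__(self, outfile, batch_size=100):
        self.outfile = outfile
        self.batch_size = batch_size
        self.pending = []

    def write(self, data):
        self.pending.append(data)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # One real write() for the whole batch instead of one per fragment.
        if self.pending:
            self.outfile.write("".join(self.pending))
            self.pending = []
```

The caller has to invoke flush() once at the end so the last partial
batch is not lost.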
Signed-off-by: Dan McGee <dan@archlinux.org>
No need to call out to the template engine to format... nothing at all.
Just fetch the attribute directly and save the render step.
Signed-off-by: Dan McGee <dan@archlinux.org>
Add a file_size field which we will use in the RSS feed, and also add a
field for future storage of the torrent data itself.
Signed-off-by: Dan McGee <dan@archlinux.org>
This should hopefully allow these views to not be labeled as
'_wrapped_view' in performance monitoring output.
Signed-off-by: Dan McGee <dan@archlinux.org>
If you wanted to see all updates regardless of architecture for
[testing] before, there wasn't really a way to do so. Add one.
Signed-off-by: Dan McGee <dan@archlinux.org>
This will prevent [staging] packages from cluttering normal users' view
on the website, but allow us to still import everything from this
repository for developer use.
Signed-off-by: Dan McGee <dan@archlinux.org>
We were actually using the postdate attribute rather than last_modified,
which means any News objects that get edited would not trigger an update
of the feed.
Signed-off-by: Dan McGee <dan@archlinux.org>
For a Package object query, we almost always did .select_related('arch',
'repo'). Refactor this into the manager as a 'normal()' method so we can
avoid sprinkling the same logic everywhere.
Signed-off-by: Dan McGee <dan@archlinux.org>
Get the stuff used to retrieve and refresh the latest date values all in
the same place, and make it a bit more beautiful by refactoring it all
into a common set of methods.
Signed-off-by: Dan McGee <dan@archlinux.org>
Implement 'tag:' style URIs for the GUID field on our RSS feeds. This
ensures new package updates show up as new, and we aren't jumping back
and forth between generated GUIDs having 'http://' and 'https://'
prefixes.
Much of the work here is to attempt to keep old news GUIDs constant so
we don't once again make everything show up as new in newsreaders.
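For illustration, a 'tag:' URI in the RFC 4151 form
tag:authority,date:specific could be built like this. The tag_uri()
helper and the choice of domain, date and path components are
assumptions; the exact layout used for the archweb GUIDs is not shown
here.

```python
from datetime import date


def tag_uri(domain, when, path):
    """Build an RFC 4151 'tag:' URI that does not depend on the URL scheme."""
    return "tag:%s,%s:%s" % (domain, when.strftime("%Y-%m-%d"), path)


# Hypothetical example values.
guid = tag_uri("archlinux.org", date(2010, 1, 30), "/news/example/")
```

Because the scheme is not part of the identifier, the same item gets the
same GUID whether it was fetched over 'http://' or 'https://'.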
Signed-off-by: Dan McGee <dan@archlinux.org>
This worked in MySQL because of its case-insensitive matching, but does
not work in other databases unless we coerce the value.
Signed-off-by: Dan McGee <dan@archlinux.org>
We need to do this in the models.py files, otherwise the post_save signal
might not be connected right away on launch of the application. Move them
over there, add a dispatch_uid so it only gets hooked up once, and do some
other function moving around so we don't have circular imports.
Signed-off-by: Dan McGee <dan@archlinux.org>
Use a 'set to None' sentinel to indicate data updates are in progress and we
need to hold off a bit on caching a new value. This logic is gleaned from
the "Scaling Django" slides presented by Mike Malone and available freely on
SlideShare.
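One way to read the sentinel approach is sketched below. SimpleCache is
a tiny in-memory stand-in written for this example, and
refresh_started() and get_latest() are hypothetical helper names; none
of this is the actual archweb code or a real cache backend API.

```python
import time


class SimpleCache:
    """Tiny in-memory stand-in for a cache backend (illustration only)."""

    def __init__(self):
        self._data = {}

    def _live(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry
        self._data.pop(key, None)
        return None

    def get(self, key, default=None):
        entry = self._live(key)
        return entry[0] if entry else default

    def set(self, key, value, timeout):
        self._data[key] = (value, time.monotonic() + timeout)

    def add(self, key, value, timeout):
        """Store only if the key is absent; return whether it was stored."""
        if self._live(key) is not None:
            return False
        self.set(key, value, timeout)
        return True


MISSING = object()


def refresh_started(cache, key, hold=5):
    # Mark the key as "update in progress" instead of deleting it.
    cache.set(key, None, hold)


def get_latest(cache, key, compute, timeout=300):
    value = cache.get(key, MISSING)
    if value is None:
        # Sentinel present: an update is in progress, so do not cache.
        return compute()
    if value is MISSING:
        value = compute()
        # add() will not overwrite a sentinel written in the meantime.
        cache.add(key, value, timeout)
    return value
```

While the sentinel is in place, readers still get a freshly computed
value but never write a possibly stale one back into the cache.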
Signed-off-by: Dan McGee <dan@archlinux.org>
We were cheating before and using non-UTC times; adjust the values we get
back from the database as appropriate so our times are not bogus.
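As a rough illustration of such an adjustment, a naive timestamp stored
in local time can be converted to UTC as shown below. The to_utc()
helper and the fixed-offset assumption are ours; the message does not
say how the conversion was actually done, and real code would also need
to handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def to_utc(naive_local, utc_offset_hours):
    """Interpret a naive datetime as local time at a fixed offset; return UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=local_tz).astimezone(timezone.utc)


# Hypothetical value: noon stored by a server running at UTC-5.
stamp = to_utc(datetime(2010, 1, 30, 12, 0), -5)
```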
Signed-off-by: Dan McGee <dan@archlinux.org>
This saves two database queries each request, meaning no database hits at
all if we are just going to return a 304 response. It also requires adding a
post_save signal to ensure our cache is updated with the correct latest news
date upon saving a news item.
Signed-off-by: Dan McGee <dan@archlinux.org>
By using the condition decorator (in a slightly odd way because these are
class-based views), we can cut down a lot on the response time for returning
a 304 status code for feeds that haven't changed. The decorator means we no
longer have to completely render the view to see if we can return a 304
status code.
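The saving comes from checking the request headers before rendering. A
framework-free sketch of that check is shown below; it is not Django's
condition decorator, and conditional_response() is a hypothetical helper
that only handles If-Modified-Since.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def conditional_response(if_modified_since, last_modified, render):
    """Return (status, body), skipping render() when the client copy is current."""
    if if_modified_since:
        try:
            client_time = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            client_time = None  # Unparseable header: fall through to a full response.
        if client_time is not None:
            if client_time.tzinfo is None:
                client_time = client_time.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= client_time:
                return 304, ""
    return 200, render()
```

Here last_modified is expected to be a timezone-aware datetime, and
render is only called when a full 200 response is actually needed.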
Signed-off-by: Dan McGee <dan@archlinux.org>
We had a bunch of extra imports, non-conventional variable names, spacing
issues, etc. that were relatively low-hanging fruit to clean up. Fix them
and make the code a bit cleaner in the process.
Signed-off-by: Dan McGee <dan@archlinux.org>
Feeds are now view-based and don't need the dictionary anymore.
get_object() now takes named arguments as well, making it a bit more
understandable when reading the code.
Signed-off-by: Dan McGee <dan@archlinux.org>
Make the feed framework a lot more flexible and make it possible to have
a feed for each architecture. You can drill down even further and also get a
feed for a particular repo; some might find this helpful for something like
tracking [testing]. Implements FS#12939.
I also bumped up the number of items available in each of these feeds; since
each feed is full of small items, it is more helpful to have more of them
available, and it should also result in fewer items being missed.
The UI isn't exactly spectacular, but I figured some sort of page is better
than none listing all the various feeds you can pull from.
Signed-off-by: Dan McGee <dan@archlinux.org>