Fix subpage links in MediaWiki

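The previous regular expression stripped every path component from a
link, so a subpage link such as href="/wiki/Foo/Bar" was reduced to
href="Bar". Now only the "/wiki/" prefix is removed.

I haven't encountered any prefix other than "/wiki/" that should be
discarded. If such prefixes exist, they would probably follow some
pattern, so the replacement code could be adjusted to accommodate them.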

This commit fixes #813.

Examples of pages with subpage links in English Wikipedia that are fixed
by this commit: "Asio (disambiguation)", "Asio C plus plus library".

This issue is much more prevalent in Wookieepedia because it has
a two-tab link system with the patterns */Legends and */Canon.
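
To illustrate the difference, here is a minimal sketch of the old and new
replacements. It uses std::regex in place of QRegExp (an assumption, so
that it builds without Qt); the function names and the sample link are
hypothetical and do not appear in the code base.

```cpp
#include <regex>
#include <string>

// Old behaviour (sketch): strip every leading path component from the
// href, mirroring the removed pattern <a\shref="/([\w\.]*/)*
std::string stripAllPathComponents(const std::string& html) {
    static const std::regex re(R"re(<a\shref="/([\w\.]*/)*)re");
    return std::regex_replace(html, re, "<a href=\"");
}

// New behaviour (sketch): strip only the "/wiki/" prefix, so the
// subpage part of the link survives.
std::string stripWikiPrefix(const std::string& html) {
    static const std::regex re(R"re(<a\shref="/wiki/)re");
    return std::regex_replace(html, re, "<a href=\"");
}
```

For a hypothetical link <a href="/wiki/Foo/Legends">, the old replacement
yields <a href="Legends">, losing the parent page name, while the new one
yields <a href="Foo/Legends">.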
Igor Kushnir 2018-01-28 22:15:03 +02:00
parent d65d9248ed
commit deea197ca7

@@ -411,8 +411,8 @@ void MediaWikiArticleRequest::requestFinished( QNetworkReply * r )
 //fix src="/foo/bar/Baz.png"
 articleString.replace( "src=\"/", "src=\"" + wikiUrl.toString() );
-// Replace the href="/foo/bar/Baz" to just href="Baz".
-articleString.replace( QRegExp( "<a\\shref=\"/([\\w\\.]*/)*" ), "<a href=\"" );
+// Remove the /wiki/ prefix from links
+articleString.replace( QRegExp( "<a\\shref=\"/wiki/" ), "<a href=\"" );
 //fix audio
 articleString.replace( QRegExp( "<button\\s+[^>]*(upload\\.wikimedia\\.org/wikipedia/commons/[^\"'&]*\\.ogg)[^>]*>\\s*<[^<]*</button>"),