Commit graph

4299 commits

Author SHA1 Message Date
Igor Kushnir df192bf555 Don't show duplicate MediaWiki articles
Duplicate articles can be shown when the alts collection is not empty
and a MediaWiki site redirects multiple words to a single page. The
alts collection can be populated when:
* option Preferences=>Advanced=>"Extra search via synonyms" is enabled;
* a Morphology dictionary is active;
* a translation of a phrase is requested in a way that makes GoldenDict
  pass the input phrase to Preferences::sanitizeInputPhrase().

Steps to reproduce 1:
1. Create and switch to a dictionary group with (1) "English Wikipedia"
   and (2) "English (US) Morphology" dictionaries in it.
2. Request a translation of the word "plays" (without quotes).

Steps to reproduce 2:
1. Create a dictionary group with "English Wiktionary" dictionary in it;
   switch to this group in the scan popup window (or in the main window
   if the Preferences=>Scan Popup=>"Send translated word to main window"
   option is enabled).
2. Select the word "i.e." (without quotes) and press Ctrl+C+C (or
   whatever hotkey is configured to translate a word from clipboard).
2021-06-17 12:06:36 +03:00
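
The duplicate-suppression idea in the commit above can be sketched roughly as follows. This is only an illustration, not the code from the commit: `WikiResult`, `dropDuplicateArticles` and the notion of a "resolved title" are hypothetical names, and the sketch assumes each lookup reports the title of the page the site finally resolved the word to.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical result of one MediaWiki lookup: the word that was queried and
// the title of the page the site resolved it to (after following redirects).
struct WikiResult
{
  std::string queriedWord;
  std::string resolvedTitle;
};

// Keeps the first result for each resolved title and drops later results that
// point to the same page. The original order is preserved.
std::vector< WikiResult > dropDuplicateArticles( std::vector< WikiResult > const & results )
{
  std::unordered_set< std::string > seenTitles;
  std::vector< WikiResult > unique;
  for( auto const & result : results )
    if( seenTitles.insert( result.resolvedTitle ).second ) // true only on first sight
      unique.push_back( result );
  return unique;
}
```

With an input phrase and one alternative form that both resolve to the same page, only the first entry survives.
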
Igor Kushnir 99ddb7686e Don't add the same phrase to history twice in a row
Each of the 3 removed history addition requests follows a call to
ArticleView::showDefinition() with the same phrase/word as an argument.
Each showDefinition() overload adds its phrase/word argument to history.

These duplicate history additions weren't noticeable because
History::addItem() searches for and removes its argument from items to
avoid duplicate history entries. But the extra function calls, signal
emissions, linear searches and QList manipulation wasted processor time.
2021-06-17 12:06:36 +03:00
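
A minimal sketch of why the duplicate additions were invisible but not free. `HistorySketch` is a hypothetical stand-in for the real History class; it uses std::vector instead of QList and assumes that new items go to the front of the list, which the commit message does not state.

```cpp
#include <algorithm>
#include <string>
#include <vector>

class HistorySketch
{
public:
  // Removes an existing equal entry (linear search) and then inserts the word
  // at the front. Calling this twice in a row with the same word leaves the
  // list unchanged, but still pays for the search, the erase and the insert.
  void addItem( std::string const & word )
  {
    auto it = std::find( items.begin(), items.end(), word );
    if( it != items.end() )
      items.erase( it );
    items.insert( items.begin(), word );
  }

  std::vector< std::string > const & getItems() const
  { return items; }

private:
  std::vector< std::string > items;
};
```
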
Igor Kushnir 60bc05218f Add input phrase's punctuation suffix to alts
Preferences::sanitizeInputPhrase() transforms an input phrase by
removing its whitespace/punctuation prefix and suffix. Translating a
phrase from X11 primary selection or from clipboard, via mouse-over or
from the command line results in such sanitization. This is useful when
a punctuation mark or a space is selected accidentally alongside a word.
This sanitization can be undesirable, however, when an abbreviated word
is selected. For example: "etc.", "e.g.", "i.e.".

This commit implements searching for the input word with the punctuation
suffix preserved as an alternative form of the sanitized word to show
articles for both. For example, when the word "etc." is translated from
the clipboard, both "ETC" and "etc." articles are displayed.

The punctuation suffix is preserved when the word is passed from the
scan popup to the main window and when the translate line text is
refreshed (e.g. when the current group is changed). The suffix is not
stored in history and favorites: doing so would require file format
changes and possibly substantial code changes. This can be implemented
later if need be.

Trim the input phrase once in ArticleNetworkAccessManager::getResource()
instead of verbose trimming in multiple places in
ArticleMaker::makeDefinitionFor().

Closes #1350.
2021-06-17 12:06:36 +03:00
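
The sanitization with a preserved punctuation suffix could look roughly like the sketch below. It is not the actual Preferences::sanitizeInputPhrase() implementation: `InputPhraseSketch` and `sanitize` are hypothetical names, and the sketch handles only ASCII whitespace and punctuation via std::isspace and std::ispunct, whereas the real code works on QString.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical pair: the sanitized phrase plus the punctuation that directly
// followed it in the original input.
struct InputPhraseSketch
{
  std::string phrase;
  std::string punctuationSuffix;

  // The alternative form to search for in addition to `phrase`.
  std::string phraseWithSuffix() const
  { return phrase + punctuationSuffix; }
};

static bool isPunct( char c )
{ return std::ispunct( static_cast< unsigned char >( c ) ) != 0; }

static bool isSpaceOrPunct( char c )
{ return std::isspace( static_cast< unsigned char >( c ) ) != 0 || isPunct( c ); }

InputPhraseSketch sanitize( std::string const & input )
{
  std::size_t begin = 0, end = input.size();
  // Strip the whitespace/punctuation prefix and suffix.
  while( begin < end && isSpaceOrPunct( input[ begin ] ) )
    ++begin;
  while( end > begin && isSpaceOrPunct( input[ end - 1 ] ) )
    --end;
  // Remember only the punctuation that immediately follows the phrase;
  // anything from the first whitespace character onwards is discarded.
  std::size_t suffixEnd = end;
  while( suffixEnd < input.size() && isPunct( input[ suffixEnd ] ) )
    ++suffixEnd;
  return { input.substr( begin, end - begin ), input.substr( end, suffixEnd - end ) };
}
```

For the input " etc. " this yields the phrase "etc" and the suffix ".", so both "etc" and "etc." can be looked up.
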
Igor Kushnir 57c4c33780 Add support for *.oga audio files
For example, the first audio link in "The United States" English
Wikipedia article - "The Star-Spangled Banner" - ends with ".oga".
Without this commit the audio link is not recognized by GoldenDict:
* it is not pronounced when a Preferences=>Audio=>"Auto-pronounce..."
  option is enabled;
* clicking on the link opens it in the default browser instead of
  playing inside GoldenDict.
2021-06-12 07:53:19 +03:00
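
Recognizing an audio link by its file extension can be sketched as below. The function name and the extension list are assumptions made for illustration; the commit message only confirms that ".oga" was added, not which other extensions GoldenDict checks or how.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns true if the URL ends with one of the listed audio extensions,
// compared case-insensitively. The list itself is illustrative.
bool hasAudioExtension( std::string url )
{
  std::transform( url.begin(), url.end(), url.begin(),
                  []( unsigned char c ) { return static_cast< char >( std::tolower( c ) ); } );

  static char const * const extensions[] = { ".wav", ".mp3", ".ogg", ".oga" };
  for( std::string const ext : extensions )
    if( url.size() >= ext.size()
        && url.compare( url.size() - ext.size(), ext.size(), ext ) == 0 )
      return true;
  return false;
}
```
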
Abs62 b2e673961d Add hi_IN.ts to project file 2021-03-09 17:53:48 +03:00
Gleb Nemshilov c550657222 Fix hi_IN.ts tags and language code 2021-03-09 13:40:53 +07:00
Abs62 968690654f 1. Fix handling of big index files
2. Increase the limit of node elements while building the index
2021-03-05 16:51:44 +03:00
proletarius101 9fb467abf3 Add a concrete Comment into the desktop entry 2021-02-27 20:04:28 +08:00
proletarius101 38d7193f49 Rename id in metadata and desktop entry to org.goldendict.GoldenDict
https://www.freedesktop.org/software/appstream/docs/chap-Metadata.html#tag-id-generic

https://specifications.freedesktop.org/desktop-entry-spec/desktop-entry-spec-latest.html#file-naming
2021-02-27 20:04:28 +08:00
Abs62 7db077bd03 DSL: Don't convert escaped spaces into non-breakable inside [s] tag 2021-02-13 11:39:05 +03:00
Abs62 72dfe25ff3 DSL: Fix resource loading in some cases 2021-02-13 11:23:00 +03:00
data-man 278c143a05 Use std::sort in suitable places 2021-02-01 15:00:31 +00:00
Nikolay Korotkiy 73ec1b5950 DICT servers: ignore refs in html 2021-01-15 18:15:42 +03:00
Abs62 c6f8d29a5a Fix tr_TR.ts (issue #1338) 2021-01-12 20:36:38 +03:00
omerfaruk-cakmak 4834396a86 Update tr_TR.ts 2021-01-10 20:32:13 +03:00
Igor Kushnir 261e45a5d7 MediaWiki: remove the obsolete "fix audio" replacement
I have searched for the "<button" string and even for the "<\s*button"
pattern in tens of articles from all 5 default Wikipedia and all 5
default Wiktionary sites. Found none. I assume this pattern is obsolete.
Removing this useless code improves performance by doing less searching.

I have run the following command on directories that contained many
Wikipedia and Wiktionary articles received by GoldenDict:
  pcregrep -MrI --buffer-size 20M '<\s*button' DIR-WITH-ARTICLES
2020-12-09 12:19:46 +02:00
Igor Kushnir dec59439b9 MediaWiki: remove the /wiki/ prefix from links w/o regexp
This string replacement is 3-5 times faster than the QRegularExpression
replacement in "The United States" and "Paris" English Wikipedia
articles on my GNU/Linux system.

Before fe39fc8a05 the pattern started with
"<a\\shref=" instead of the current "<a\\s+href=", and no related bug
has been reported. I haven't encountered any whitespace character other
than space in this position. I believe that a single tab or a single EOL
character does not make sense after "<a". So a regression is unlikely.

I have searched for a tab or a newline character after "<a" and for a
whitespace character after "<a " in tens of articles from all 5 default
Wikipedia and all 5 default Wiktionary sites. Found none.

I have run the following command on directories that contained many
Wikipedia and Wiktionary articles received by GoldenDict:
  pcregrep -MrI --buffer-size 20M "$PATTERN" DIR-WITH-ARTICLES
with PATTERN='<a(\t|\n)' and PATTERN='<a \s+href'.
2020-12-09 12:19:10 +02:00
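
The trade-off described above, a literal string replacement instead of a regular expression, can be sketched like this. The helper names and the exact literal being replaced are assumptions; the commit message says only that the /wiki/ prefix is removed from links and that a single space after "<a" is expected.

```cpp
#include <cstddef>
#include <string>

// Replaces every occurrence of `from` with `to`, scanning left to right.
void replaceAll( std::string & text, std::string const & from, std::string const & to )
{
  if( from.empty() )
    return;
  for( std::size_t pos = text.find( from ); pos != std::string::npos;
       pos = text.find( from, pos + to.size() ) )
    text.replace( pos, from.size(), to );
}

// Literal counterpart of a "<a\\s+href=" style regexp replacement: it matches
// only a single space between "<a" and "href", which is the assumption the
// commit message argues is safe.
std::string stripWikiPrefix( std::string html )
{
  replaceAll( html, "<a href=\"/wiki/", "<a href=\"" );
  return html;
}
```

A link written with a tab or a newline after "<a" is left untouched by this version, which is exactly the case the pcregrep search above was meant to rule out.
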
Igor Kushnir b7da546dd5 Fix string case in all files: Goldendict => GoldenDict
Run the following string-replacement command in my GNU/Linux system:
    git grep -l Goldendict | xargs sed -i 's/Goldendict/GoldenDict/g'
2020-11-19 18:36:35 +02:00
Abs62 06177c31f6 Merge pull request #1317 from vedgy/minor-preferences-improvements
Minor Preferences improvements
2020-11-19 19:15:44 +03:00
Igor Kushnir ecb8f54293 Preferences: disable spinboxes when their checkboxes are unchecked
Neither of the two int options has any effect if the corresponding bool
option is turned off.
2020-11-19 18:00:44 +02:00
Igor Kushnir aba8997438 Increase inputPhraseLengthLimit option's maximum
Those who use GoldenDict to translate long text with Google Translate
or pronounce it with a text-to-speech engine (see a discussion in
comments under #1313) may still want to limit the input phrase length.
But they might prefer a limit greater than the current maximum - ten
thousand symbols. Ten million minus one symbols should be generous
enough. I don't want to increase the maximum further to avoid excessive
widening of spinboxes in the Preferences UI. Besides, a ten megabyte
input phrase freezes GoldenDict's UI with high CPU usage for a minute on
my system.
2020-11-19 17:58:35 +02:00
Abs62 3ebba33f10 Update help system 2020-11-19 18:33:35 +03:00
Abs62 61e25924be Lupdate all translations, update Russian translation 2020-11-19 18:31:18 +03:00
Igor Kushnir dea11ca080 inputPhraseLengthLimit option's single step: 5 => 10
This speeds up decreasing the default value that is probably too large
for most users. I think that very few users would want to tune this
option's value finer than 10. Those who need such precision can enter
the desired number manually.
2020-11-19 12:56:29 +02:00
Igor Kushnir 27ca24f83d Fix indentation of recently added code: 4 => 2 spaces 2020-11-19 12:49:23 +02:00
Igor Kushnir 193aa4e31d Set up network disk cache for articleNetMgr
When a Wikipedia article is already cached, this change reduces the
amount of sent and received network data almost tenfold.

Setting up a network disk cache in the same way for dictNetMgr does not
noticeably impact the amount of network traffic. Either this network
access manager sends and receives very little data or the data is never
the same. So dictNetMgr does not need a disk cache.

Use QNetworkDiskCache's default maximum size of 50 MiB as the default
network cache size. This size is large enough to accommodate tens of
huge MediaWiki articles. It is also small enough that the user is
unlikely to run out of disk space because of the cache.

Clear network cache on exit by default because most users probably
don't load the same online articles after restarting GoldenDict. Besides,
storing the network cache on disk indefinitely by default would be a new
privacy risk that users would not expect.

Nikita Moor came up with the idea and wrote an initial network disk
cache implementation in #1310.
2020-11-18 19:04:28 +02:00
Abs62 8364b04dc0 Update help system 2020-11-16 18:13:24 +03:00
Abs62 425fe32de7 Lupdate all translations, update Russian translation 2020-11-16 18:12:56 +03:00
Abs62 c02915d5e5 1. Turn the "limit input phrase length" option off by default
2. Set input phrase length limit to 1000 by default
2020-11-16 18:12:17 +03:00
Igor Kushnir 61eb4e08fe Add options to limit input phrase length
When a long text is accidentally selected or copied while the scan popup
is on, GoldenDict spends a lot of CPU time gradually creating a
"No translation..." article. When translation of a huge (e.g. 15 MB) text
from the clipboard is (accidentally) requested, GoldenDict freezes for a
while. Turning the added input phrase limit option on eliminates this
waste of CPU time.

I have implemented these options primarily for selection and clipboard,
but they also affect mouse-over translation on Windows and command line
translation requests. This is mostly because I did not bother to limit
the options' scope. I guess hovering over an extremely long text without
spaces (e.g. Base64-encoded) could cause the same performance issue on
Windows. The command-line translation could be requested from a script
that integrates GoldenDict with some other application, from which long
text could be sent for translation by accident.

I hope that the default value of 200 characters will be sufficient for
just about any real-world user input in any language. The option is on
by default, because the default length limit is generous and any longer
text is unlikely to be sent for translation intentionally. My personal
preference for the input phrase length limit is 100 symbols.

ArticleView::pasteTriggered() didn't call QString::simplified() on the
text retrieved from the clipboard. I assumed this was an oversight, so
now it *is* called - indirectly, via Preferences::sanitizeInputPhrase().
2020-11-13 17:44:38 +02:00
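
A rough sketch of how such a pair of options might gate an input phrase. The struct and function are hypothetical; the member names merely echo the option names mentioned in this log. Rejecting an over-long phrase by returning an empty string is an assumption, since the commit message does not say what happens to a phrase that exceeds the limit. Note also that std::string::size() counts bytes, not characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical holder for the two options: a bool that enables the limit and
// an integer limit that has no effect while the bool is off.
struct InputLimitOptions
{
  bool limitInputPhraseLength;
  std::size_t inputPhraseLengthLimit;
};

// Returns the phrase unchanged if it is acceptable, or an empty string if the
// limit is enabled and the phrase is longer than the limit.
std::string applyInputPhraseLimit( InputLimitOptions const & options,
                                   std::string const & phrase )
{
  if( options.limitInputPhraseLength && phrase.size() > options.inputPhraseLengthLimit )
    return std::string();
  return phrase;
}
```
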
Igor Kushnir 0b7c65023c articleSizeLimit option: minor code cleanup
When an integral value is converted to a signed integer type, if the
result is not representable, the resulting value is implementation
defined (until C++20). Convert the string value from configuration file
to the target type (int) to avoid the redundant type conversion.

Use the more direct and efficient QSpinBox::value() instead of
QAbstractSpinBox::text().toInt().
2020-11-06 22:04:59 +02:00
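
The point about converting straight to the target type can be illustrated with standard C++. This is a sketch under assumptions, not the code from the commit: `parseIntOption` is a hypothetical helper, and std::stoi stands in for whatever conversion the real configuration loader uses.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Parses a configuration string directly as int. Going through a wider or
// unsigned type and then narrowing to int would give an implementation-defined
// value (before C++20) whenever the number does not fit; parsing as int lets
// the out-of-range case be detected instead.
int parseIntOption( std::string const & text, int fallback )
{
  try
  {
    std::size_t parsedLength = 0;
    int const value = std::stoi( text, &parsedLength );
    // Reject trailing garbage such as "12abc".
    return parsedLength == text.size() ? value : fallback;
  }
  catch( std::invalid_argument const & )
  {
    return fallback; // not a number at all
  }
  catch( std::out_of_range const & )
  {
    return fallback; // does not fit into int
  }
}
```
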
JingMatrix 63aeb0ef6d quote possible apostrophe
The French translation of "Collapse article" contains an apostrophe.
2020-11-02 23:09:22 +01:00
Passw 30a61c0ee0 remove dead assignment in mediawiki.cc 2020-10-25 20:12:33 +08:00
Abs62 9fb33d10bd Disable focus acquiring by "New tab" button 2020-10-22 20:41:37 +03:00
Abs62 5d7d553bb5 Dsl: Fix displayed headword selection for the case of ignore diacritic 2020-10-22 18:32:55 +03:00
Passw 332a95b271 Fix typo in btreeindx.hh 2020-10-20 10:49:37 +08:00
Vitaly Zaitsev 9ca9184eab Fixed build with GCC 11. 2020-10-15 15:04:20 +02:00
Passw abe8e9efad Fix bug on not showing default programs under Linux/macOS 2020-10-10 14:49:36 +08:00
Abs62 69e37a3d90 DSL: Show multi-word unknown tags 2020-10-09 20:49:35 +03:00
Abs62 8e13789e8f Zim: Handle clusters in random order case 2020-10-06 17:52:30 +03:00
Abs62 b70c3e8c88 DSL: Show unknown tags 2020-10-05 18:44:53 +03:00
Abs62 8126ea71a9 Merge branch 'patch-1' of https://github.com/CyrusYip/goldendict into Temp 2020-10-05 18:42:30 +03:00
CyrusYip 4f0d529d95 Add uninstallation method to README 2020-10-02 05:59:19 +08:00
Jose Riha 1a56fdde08 Update Slovak translation 2020-09-30 21:34:48 +02:00
Abs62 dda311cf9c Merge pull request #1247 from 12101111/fix-clang10
Remove unused wrong code.
2020-07-19 13:38:12 +03:00
Carmina16 17aea140d3 Add ie_001.ts
Set up the Interlingue translation
2020-07-19 16:55:12 +07:00
Carmina16 78cdbcae50 Add Interlingue translation
Set up the Interlingue translation
2020-07-19 16:53:21 +07:00
Abs62 77d33240e4 Some fixes from hrimfaxi 2020-07-18 22:25:55 +03:00
Abs62 6395d8e766 Add libswresample-dev to travis.yml 2020-07-18 22:00:30 +03:00
Abs62 05f921d014 Mac-specific: Add zstd library 2020-07-18 21:51:35 +03:00