[Upstream] Words and Character excluding spaces Word Count incorrect with Record Changes enabled

Bug #981033 reported by komputes
This bug affects 2 people
Affects               Status        Importance  Assigned to  Milestone
LibreOffice           Fix Released  Medium
libreoffice (Ubuntu)  Fix Released  Medium      Unassigned

Bug Description

1) lsb_release -rd
Description: Ubuntu 12.04 LTS
Release: 12.04

2) apt-cache policy libreoffice-writer
libreoffice-writer:
  Installed: 1:3.5.2-2ubuntu1
  Candidate: 1:3.5.2-2ubuntu1
  Version table:
 *** 1:3.5.2-2ubuntu1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise/main i386 Packages
        100 /var/lib/dpkg/status

3) What is expected to happen: in Writer, paste the following text into a blank document:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus eu ligula
et arcu dapibus viverra ac ut elit. Proin rhoncus sapien et velit cursus ac
molestie justo malesuada. Aliquam pretium, orci nec malesuada laoreet, nisl
nisi tristique dui, vitae rutrum ipsum libero sit amet nunc.

Activate record changes via Edit -> Changes -> Record, highlight everything from:
Proin

until:
nunc.

delete it, and the Word Count should match what Word 2010 shows in this screenshot:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/981033/+attachment/3134188/+files/word2010.png

4) What happens instead is that Word Count shows:
Words: 45
Characters: 113
Characters excluding spaces: 245

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: libreoffice-calc 1:3.5.1-1ubuntu4
ProcVersionSignature: Ubuntu 3.2.0-21.34-generic-pae 3.2.13
Uname: Linux 3.2.0-21-generic-pae i686
ApportVersion: 2.0-0ubuntu4
Architecture: i386
Date: Fri Apr 13 14:22:48 2012
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Alpha i386 (20120222)
ProcEnviron:
 TERM=xterm
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: libreoffice
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
In , mogliii (mogliii) wrote :

Created attachment 57796
Screenshot of text showing also the Word Count window

Problem description:

Steps to reproduce:
1. Open new writer document and paste the following text:
"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus eu ligula et arcu dapibus viverra ac ut elit. Proin rhoncus sapien et velit cursus ac molestie justo malesuada. Aliquam pretium, orci nec malesuada laoreet, nisl nisi tristique dui, vitae rutrum ipsum libero sit amet nunc."

2. Open Tools -> Word Count, you will see
Words: 45
Characters: 289
Characters excluding spaces: 245

3. Activate tracking of changes
Edit -> Changes -> Record

4. Mark everything from "Proin" until "nunc." and delete.

5. Word Count now shows
Words: 45
Characters: 57
Characters excluding spaces: 245

Current behavior:
Text deleted while tracking changes affects only "Characters" in the Word Count.

Expected behavior:
Changes affect either all three counts, or none.
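
As a rough cross-check of the numbers above, here is a small Python sketch. It assumes a word is any whitespace-separated token, which may not match Writer's word-boundary rules exactly, and the helper name `counts` is made up for illustration.

```python
# The sample text from step 1, joined into a single paragraph.
TEXT = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
    "Phasellus eu ligula et arcu dapibus viverra ac ut elit. "
    "Proin rhoncus sapien et velit cursus ac molestie justo malesuada. "
    "Aliquam pretium, orci nec malesuada laoreet, nisl nisi tristique dui, "
    "vitae rutrum ipsum libero sit amet nunc."
)


def counts(text):
    """Return (words, characters, characters excluding spaces).

    Assumption: words are whitespace-separated tokens.
    """
    return len(text.split()), len(text), len(text.replace(" ", ""))


# Counts for the whole paragraph (step 2).
full = counts(TEXT)

# Counts for what is left once everything from "Proin" onward is deleted.
kept = counts(TEXT[:TEXT.index("Proin")])

print("full:", full)
print("kept:", kept)
```

Under that assumption the full paragraph gives the 45 / 289 / 245 from step 2. If deleted text were ignored consistently, all three numbers would drop together for the remaining text instead of only "Characters" changing.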

Platform (if different from the browser):
Windows 7 64bit
LibreOffice 3.5.0rc3
Build ID: 7e68ba2-a744ebf-1f241b7-c506db1-7d53735

Revision history for this message
komputes (komputes) wrote :
Changed in df-libreoffice:
importance: Unknown → Medium
status: Unknown → New
penalvch (penalvch)
summary: - Word count not working well with "Changes -> record". Inconsistent.
+ [Upstream] Word count not working well with "Changes -> record".
+ Inconsistent.
Revision history for this message
Launchpad Janitor (janitor) wrote : Re: [Upstream] Word count not working well with "Changes -> record". Inconsistent.

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in libreoffice (Ubuntu):
status: New → Confirmed
Revision history for this message
In , Jmrecarey (jmrecarey) wrote :

[Not Reproducible] with "LibreOffice 3.3.4 - Ubuntu 11.04 (32bit) Spanish UI"

Revision history for this message
penalvch (penalvch) wrote : Re: [Upstream] Word count not working well with "Changes -> record". Inconsistent.
penalvch (penalvch)
description: updated
summary: - [Upstream] Word count not working well with "Changes -> record".
- Inconsistent.
+ [Upstream] Words and Character excluding spaces Word Count incorrect
+ with Track Changes enabled
Changed in libreoffice (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
summary: [Upstream] Words and Character excluding spaces Word Count incorrect
- with Track Changes enabled
+ with Record Changes enabled
description: updated
Revision history for this message
In , penalvch (penalvch) wrote :

1) lsb_release -rd
Description: Ubuntu 12.04 LTS
Release: 12.04

2) apt-cache policy libreoffice-writer
libreoffice-writer:
  Installed: 1:3.5.2-2ubuntu1
  Candidate: 1:3.5.2-2ubuntu1
  Version table:
 *** 1:3.5.2-2ubuntu1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise/main i386 Packages
        100 /var/lib/dpkg/status

3) What is expected to happen: in Writer, paste the following text into a blank document:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus eu ligula
et arcu dapibus viverra ac ut elit. Proin rhoncus sapien et velit cursus ac
molestie justo malesuada. Aliquam pretium, orci nec malesuada laoreet, nisl
nisi tristique dui, vitae rutrum ipsum libero sit amet nunc.

Activate record changes via Edit -> Changes -> Record, highlight everything from:
Proin

until:
nunc.

delete it, and the Word Count should match what Word 2010 shows in this screenshot:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/981033/+attachment/3134188/+files/word2010.png

4) What happens instead is that Word Count shows:
Words: 45
Characters: 113
Characters excluding spaces: 245

description: updated
Revision history for this message
In , penalvch (penalvch) wrote :

*** Bug 48072 has been marked as a duplicate of this bug. ***

Changed in df-libreoffice:
status: New → Confirmed
Revision history for this message
In , Sasha-libreoffice (sasha-libreoffice) wrote :

reproduced in 3.5.3 on Fedora 64 bit
not reproduced in 3.3.4, therefore regression
problem only in Tools->Word count, no problem in File->Properties->Statistics

Revision history for this message
In , Muhammad Haggag (mhaggag) wrote :

(In reply to comment #4)
> reproduced in 3.5.3 on Fedora 64 bit
> not reproduced in 3.3.4, therefore regression
> problem only in Tools->Word count, no problem in File->Properties->Statistics

Are you sure file statistics aren't suffering from the same problem? It's showing 45 for me (same as the word count dialog and status bar).

Also, when you go ahead and save the document, it updates the statistics differently from how the word count dialog does it--it actually counts characters marked for deletion (286), and so the word count dialog shows the same count (since it's seeded from the document statistics). As soon as you start typing, the word count code is invoked and the number of characters becomes 113 again.

The problem is that character counting masks text marked as deleted and hidden text by replacing it with spaces, but all other word/character counting code doesn't. It seems intentional, although the "Why" isn't clear to me. See SwTxtNode::CountWords and its call to lcl_MaskRedlinesAndHiddenText: http://opengrok.libreoffice.org/xref/core/sw/source/core/txtnode/txtedt.cxx#1864
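
To make the mismatch described above concrete, here is a toy Python model. It is not the code in txtedt.cxx: all function names are hypothetical, words are assumed to be whitespace-separated tokens, and deleted ranges are simply cut out instead of being masked with spaces, which keeps the sketch short.

```python
def visible_text(text, deleted_ranges):
    """Return text with every (start, end) deleted range cut out.

    Assumption: ranges are half-open and do not overlap.
    """
    pieces, pos = [], 0
    for start, end in sorted(deleted_ranges):
        pieces.append(text[pos:start])
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces)


def stats(text):
    """Return (words, characters, characters excluding spaces)."""
    return len(text.split()), len(text), len(text.replace(" ", ""))


def mismatched_counts(text, deleted_ranges):
    """Only the character count skips deleted text, as in the report."""
    words, _, no_spaces = stats(text)
    _, characters, _ = stats(visible_text(text, deleted_ranges))
    return words, characters, no_spaces


def consistent_counts(text, deleted_ranges):
    """All three counts skip deleted text."""
    return stats(visible_text(text, deleted_ranges))


sample = "one two three four"
deleted = [(8, 18)]  # "three four" is marked as deleted

print(mismatched_counts(sample, deleted))
print(consistent_counts(sample, deleted))
```

In the mismatched variant only the middle number reacts to the deletion, which is the pattern reported in this bug. The consistent variant derives all three numbers from the same visible text.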

I tracked the change with 'git blame' to the following commit by John LeMoyne Castle:
http://cgit.freedesktop.org/libreoffice/core/commit/?id=4bd28ba4c6d2af96bb6638b88635598e1bb88e8f

Unfortunately, the commit message doesn't explain why it's doing character masking. A Google search for "John LeMoyne Castle character count" leads to fdo#30550: https://bugs.freedesktop.org/show_bug.cgi?id=30550

It looks like the initial work was done by Mattias Johnsson, then John fixed several bugs. It seems the intent of his commit was to fix the selection case only. It might be that the character masking bit was erroneously added, perhaps a leftover from another commit.

My recommendation is to remove the masking of deleted characters, since it'll be a lot of work to get that working properly (and consistently) with both word/character count and document statistics, for no obvious benefit. If there's demand for such a feature, it should be filed and tracked separately.

I'll be posting a patch shortly to remove the masking and make the behavior consistent.

Revision history for this message
In , Muhammad Haggag (mhaggag) wrote :

Created attachment 62923
Proposed patch.

After looking at the code closely, I change my recommendation. It's actually straightforward to get the word counting code to consistently ignore deleted content. Patch attached.

One problem that remains with this patch is document statistics. When you save the document, a gross word count is computed (including deleted content), saved with the document, and seeded to the word count dialog. The word count dialog (and status bar) show the incorrect count until you edit the document (insert/delete something), at which point the proper word counting logic is invoked and the count is corrected.
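
The stale-count behavior described above can be pictured with a small Python model. Everything in it is hypothetical (the class `ToyDocument`, its methods, and the choice to insert typed text at the start); it only mirrors the sequence "save seeds a gross count, the next edit triggers a proper recount".

```python
class ToyDocument:
    """Hypothetical model of the seeding behavior; not LibreOffice code."""

    def __init__(self, text, deleted_start):
        self.text = text
        # Assumption: everything from this index on is marked as deleted.
        self.deleted_start = deleted_start
        # What the word count dialog and status bar currently display.
        self.shown_words = None

    def gross_words(self):
        """Word count that includes deleted content."""
        return len(self.text.split())

    def net_words(self):
        """Word count that ignores deleted content."""
        return len(self.text[:self.deleted_start].split())

    def save(self):
        # Saving computes the gross count and seeds the dialog with it.
        self.shown_words = self.gross_words()

    def type_text(self, extra):
        # Any edit runs the proper counting logic and corrects the display.
        self.text = extra + self.text
        self.deleted_start += len(extra)
        self.shown_words = self.net_words()


doc = ToyDocument("one two three four", deleted_start=8)
doc.save()
after_save = doc.shown_words      # gross count, includes "three four"
doc.type_text("zero ")
after_edit = doc.shown_words      # corrected count, deleted text ignored
print(after_save, after_edit)
```

In this model the display is wrong only between the save and the next edit, which matches the window described in the comment.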

I won't post the patch for review/commit yet in the hopes that I can fix the document statistics issue as well. If it looks complicated, I'll get this committed and pursue the statistics issue separately.

Revision history for this message
In , Muhammad Haggag (mhaggag) wrote :

Created attachment 62967
Updated patch.

The reason document statistics is broken comes down to the following commit: http://cgit.freedesktop.org/libreoffice/core/commit/?id=6af264883910fe31433b4164b1956f4f9ed75ecb

It disables redlining deleted changes (by removing the flag REDLINE_SHOW_DELETE) when exporting (saving) documents, which leads to SwTxtNode::CountWords counting the deleted changes instead of masking them (since it only masks redlined content).
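
A minimal Python sketch of that dependency, under stated assumptions: the flag name is taken from the comment above, but its numeric value, the function `count_words`, and the range representation are invented for illustration and do not reflect the real SwTxtNode::CountWords signature.

```python
# Flag name from the comment above; the value here is made up.
REDLINE_SHOW_DELETE = 0x1


def count_words(text, deleted_ranges, redline_mode):
    """Count whitespace-separated words, masking deletions only when shown.

    Assumption: deleted ranges are half-open (start, end) index pairs.
    """
    if redline_mode & REDLINE_SHOW_DELETE:
        # Deletions are visible as redlines, so they can be masked out.
        chars = list(text)
        for start, end in deleted_ranges:
            chars[start:end] = " " * (end - start)
        text = "".join(chars)
    # Without the flag nothing is masked and deleted text is counted too.
    return len(text.split())


sample = "one two three four"
deleted = [(8, 18)]  # "three four" is marked as deleted

while_editing = count_words(sample, deleted, REDLINE_SHOW_DELETE)
while_saving = count_words(sample, deleted, 0)  # flag removed during export
print(while_editing, while_saving)
```

With the flag set, the deleted words are masked and drop out of the count. With the flag cleared, as described for export, the same call returns the gross count.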

It appears that was done as a bug fix. Unfortunately, it was fixing a bug from 2001, and information on such bugs is not available anymore. The attached patch leaves the broken document statistics behavior as is, and I'll file a separate bug to track it.

Changed in df-libreoffice:
status: Confirmed → In Progress
Revision history for this message
In , Libreoffice-bugs (libreoffice-bugs) wrote :

Muhammad Haggag committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6c14d15dbbdc8920e1695b5fdc32b6519508815d

fdo#46757 Word/character count incorrect with record changes enabled

Revision history for this message
In , Libreoffice-bugs (libreoffice-bugs) wrote :

Muhammad Haggag committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=553f9ccfc8a6048528b9ffcd535adf7f1cd51fc7&g=libreoffice-3-6

fdo#46757 Word/character count incorrect with record changes enabled

It will be available in LibreOffice 3.6.

Revision history for this message
In , Libreoffice-bugs (libreoffice-bugs) wrote :

Caolan McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=03a59c7096cde0ced1a88069647c3ec60f86f9d6

Regression test for fdo#46757

Revision history for this message
In , Stefan K. (astron) wrote :

Setting to FIXED.

Thank you, Muhammad!

Revision history for this message
In , Stefan K. (astron) wrote :

*** Bug 50590 has been marked as a duplicate of this bug. ***

Changed in df-libreoffice:
status: In Progress → Fix Released
Revision history for this message
Björn Michaelsen (bjoern-michaelsen) wrote :

released with 3.6 to quantal

Changed in libreoffice (Ubuntu):
status: Triaged → Fix Released