Hi Folks,
In response to Dom's comments on bug 6427:
PT should lazy-create blocks, pages, etc... if none exist
i don't believe that this is an import-msword problem. i believe that it is a
much more general problem - one that uwog encountered before.
what we're seeing here is an empty section - i.e.
<section props="">
</section>
i think that it is entirely valid to have an empty section. i also think that it
is entirely valid for uwog's
<section>
<table>...</table>
</section>
case. if on import, <section></section> does not create a fp_Page, the piece
table should lazy-create one for it.
------- Additional Comment #12 From Dom Lachowicz 2004-09-18 02:59 -------
crash avoided. keeping bug open, though, because i feel there is a lot more that
should happen behind the scenes wrt robustness.
......................................................................
There are two things here.
Firstly, in regard to having <section><table>....</table>:
AbiWord should indeed allow this. It would probably take me a few full days of
work to implement. However, I don't think this is a pressing issue right now.
It's not that I don't think this is a good idea; it's just that I want to
spend the time I have on AbiWord in different ways.
Secondly, in regard to Dom's point that we should correct bad documents
automatically: I think this is an excellent idea, and it would make crashes
on load even rarer.
AbiWord has a number of rules that we try to make the document loader
adhere to.
These are simple, and code can be written to auto-correct documents that
break them. At the end of the document load, we can scan the document, check
these rules and correct the document.
Here they are:
1. Every <section> must be followed by a <block> strux.
2. Every fmtMark, field, text or object frag must be preceded by
(a) another of these frags
(b) a <block> strux
(c) a </footnote> or </endnote> strux
3. Every <table> strux must be followed by a <cell> strux.
4. Every <cell> strux must be followed by a <block> strux.
5. Every </cell> strux must be followed by either:
(a) <cell> strux
(b) </table> strux
6. Within <footnote> ... </footnote> and <endnote> ... </endnote> struxes
the only struxes allowed are <block>.
7. <hdrFtr> struxes must be placed after the main document. No <section>
struxes are allowed after them.
8. No <frame></frame>, <footnote></footnote>, <endnote></endnote> or
<TOC></TOC> struxes are allowed after <hdrFtr>.
I think that just about covers things.
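To make the "must be followed by" rules (1, 3, 4 and 5) concrete, here is a rough sketch of a checker. It does not use the real piece table: Frag, Violation, FollowRule and checkFollowRules are all made-up names for illustration, the document is flattened into a plain vector, and I am assuming "followed by" means the very next frag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for piece-table frags (not AbiWord's types).
enum class Frag { Section, Block, Table, Cell, EndCell, EndTable, Text };

struct Violation {
    std::size_t index;  // position of the frag that breaks a rule
    int rule;           // rule number from the list above
};

// Rules 1, 3 and 4: a given strux must be immediately followed by one
// specific strux (assumption: "followed by" means the very next frag).
struct FollowRule { int rule; Frag first; Frag requiredNext; };

static const FollowRule kFollowRules[] = {
    {1, Frag::Section, Frag::Block},
    {3, Frag::Table,   Frag::Cell},
    {4, Frag::Cell,    Frag::Block},
};

std::vector<Violation> checkFollowRules(const std::vector<Frag>& frags) {
    std::vector<Violation> out;
    for (std::size_t i = 0; i < frags.size(); ++i) {
        const bool hasNext = (i + 1 < frags.size());
        for (const FollowRule& r : kFollowRules) {
            if (frags[i] != r.first) continue;
            if (!hasNext || frags[i + 1] != r.requiredNext)
                out.push_back({i, r.rule});
        }
        // Rule 5: </cell> must be followed by <cell> or </table>.
        if (frags[i] == Frag::EndCell) {
            const bool ok = hasNext && (frags[i + 1] == Frag::Cell ||
                                        frags[i + 1] == Frag::EndTable);
            if (!ok) out.push_back({i, 5});
        }
    }
    return out;
}
```

With this model, the empty <section></section> from the bug shows up as a single rule 1 violation at index 0, and so does the <section><table> case.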
Tomas and I have implemented the following "insertBeforeFrag" methods in
pd_Document that allow the insertion of struxes to repair invalid piece tables:
bool insertStruxBeforeFrag(pf_Frag * pF, PTStruxType pts,
                           const XML_Char ** attributes,
                           pf_Frag_Strux ** ppfs_ret = 0);
bool insertSpanBeforeFrag(pf_Frag * pF, const UT_UCSChar * p,
                          UT_uint32 length);
bool insertObjectBeforeFrag(pf_Frag * pF, PTObjectType pto,
                            const XML_Char ** attributes);
bool insertFmtMarkBeforeFrag(pf_Frag * pF);
bool insertFmtMarkBeforeFrag(pf_Frag * pF, const XML_Char ** attributes);
One more rule for the list:
9. Every </table> strux must be preceded by a </cell> strux.
To actually implement this repair, one just needs to iterate through
the frags in the document and verify that the rules always hold.
If they don't, insert the appropriate frags or struxes to repair the document
where possible. E.g. if there are some content frags in places they shouldn't
be, insert a block before them.
If the document can't be repaired, we can at least inform the user
of the problem.
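A rough sketch of what such a repair pass might look like, covering only rules 1 and 2. Again this is a toy model and not the real piece table: FragKind, mayPrecedeContent and repairFrags are made-up names, and the vector insert stands in for a call like insertStruxBeforeFrag() on the real document.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified frag kinds (not AbiWord's types). "Content" covers
// fmtMark, field, text and object frags; "Other" is any other strux.
enum class FragKind { Section, Block, EndFootnote, EndEndnote, Content, Other };

// Rule 2: content may follow other content, a block, or a closing
// footnote/endnote strux.
static bool mayPrecedeContent(FragKind k) {
    return k == FragKind::Content || k == FragKind::Block ||
           k == FragKind::EndFootnote || k == FragKind::EndEndnote;
}

// Walks the frags once and inserts a Block wherever rule 1 or rule 2 is
// broken. Returns the number of blocks inserted.
std::size_t repairFrags(std::vector<FragKind>& frags) {
    std::size_t inserted = 0;
    for (std::size_t i = 0; i < frags.size(); ++i) {
        if (frags[i] == FragKind::Section) {
            // Rule 1: a section must be followed by a block.
            if (i + 1 == frags.size() || frags[i + 1] != FragKind::Block) {
                frags.insert(frags.begin() + static_cast<std::ptrdiff_t>(i + 1),
                             FragKind::Block);
                ++inserted;
            }
        } else if (frags[i] == FragKind::Content) {
            // Rule 2: orphaned content gets a block inserted before it.
            if (i == 0 || !mayPrecedeContent(frags[i - 1])) {
                frags.insert(frags.begin() + static_cast<std::ptrdiff_t>(i),
                             FragKind::Block);
                ++inserted;
                ++i;  // step past the content frag that was just repaired
            }
        }
    }
    return inserted;
}
```

Running this on a lone Section yields Section, Block. A second run over an already repaired document inserts nothing, which is a handy sanity check for the real thing too. The other rules, and the "can't be repaired" case, are left out of this sketch.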
Any volunteers to implement this? Have I missed any rules?
Cheers
Martin
Received on Sat Sep 18 15:01:09 2004