wow, you've got it all figured out. So instead of fixing the config file, just add more layers of code to catch and paper over the errors.
Well, yes, to an extent, for I've been studying it all year--why else would a crazy Polack be installing and running each and every version of every code build available to me? And no, Martin, not the config... that's the problem. THEY don't fix the config, WE DO.
The false faults I'm talking about can be handled by simple translation of old model data (pre-TRS2004, before they added bogeys {}, Thumbnails {}, etc. in the config and, later, texture.txt files) with apropos mappings and a short subroutine for 'handling' each old tag. If you doubt it, consider all the v1.3 to v2.6 assets which work fine in TS12--they get translated AFTER saving, on input to Railyard, Surveyor, and Driver. I'm pointing out they should be translating first and then saving an optimized form, which at run-time loading would give large speed/performance benefits and be more robust. The latter, because if it can't be converted at commit, it can't be loaded--which takes away the excuse that they are error checking tighter and tighter. THAT measure is as tight as you can possibly get!
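To make that concrete, here is a minimal sketch of the mapping idea in C--the table and names are mine, not anything N3V actually ships. Renaming is only the simplest case, of course; where an old tag became a full container, the 'handling' subroutine I mentioned would build the new structure around the old value.

```c
/* Sketch only (hypothetical names): map retired pre-TRS2004 tag spellings
 * to their modern equivalents before the config is committed, so the
 * runtime never has to see the old forms. */
#include <stdio.h>
#include <string.h>

struct tag_map { const char *old_tag; const char *new_tag; };

static const struct tag_map OLD_TAGS[] = {
    { "bogey",     "bogeys"     },   /* single tag became a container in TRS2004 */
    { "thumbnail", "thumbnails" },   /* likewise for thumbnails                  */
    /* ...one row per retired tag... */
};

/* Return the modern spelling, or the input unchanged if already current. */
static const char *translate_tag(const char *tag)
{
    for (size_t i = 0; i < sizeof OLD_TAGS / sizeof OLD_TAGS[0]; ++i)
        if (strcmp(tag, OLD_TAGS[i].old_tag) == 0)
            return OLD_TAGS[i].new_tag;
    return tag;
}

int main(void)
{
    printf("%s -> %s\n", "bogey", translate_tag("bogey"));
    return 0;
}
```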
And moreover, if they did so, instead of a chump file with compressed data forms which must still be read in and then put away, they could save the committed file as a binary--an object file with a 'what am I, and how do you treat me' header--which they can suck into a memory block much faster (a struct or union in C/C#), whose positions are known (the structures their application needs--direct transplants, so to speak). Oh, there are variable-length objects, all the textures and text string elements, but again, they already know how to handle those; the struct or union is the handle and contains the pointers to such in general memory. They've got to handle the data anyway; there is no good reason to do it inefficiently and slowly! BTW, Windwalkr agreed with that assessment last July--but TANE was already under development, wasn't it? Hopefully he gave it some attention and will surprise us with a more modern approach.
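Something like this, under my own made-up layout (nothing here is their actual chump or binary format): a small header that says what the file is and how to treat it, then fixed-size records that can be pulled straight into a struct in one read, with the variable-length strings and texture names reached through offsets into a pool at the end.

```c
/* Sketch of the "object file with a header" idea; every field name here is
 * an assumption for illustration, not N3V's format. */
#include <stdint.h>

struct committed_header {
    uint32_t magic;          /* identifies the file: "what am I"             */
    uint32_t version;        /* layout version: "how do you treat me"        */
    uint32_t record_count;   /* number of fixed-size records that follow     */
    uint32_t string_offset;  /* where the variable-length pool begins        */
    uint32_t string_bytes;   /* size of that pool                            */
};

/* Fixed-size records follow the header and can be read (or memory-mapped)
 * straight into place -- no per-line parsing at load time. */
struct committed_tag {
    uint32_t key_token;      /* token id of the keyword                      */
    uint32_t value_kind;     /* integer, float, kuid, string-offset...       */
    union {                  /* the value itself, or an offset into the pool */
        int64_t  i;
        double   f;
        uint32_t string_offset;
    } value;
};
```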
- (Feel free to look up any of these terms on Wikipedia if they're unfamiliar; I'm feeling too lazy to link tonight! Sorry.)
Consider misspellings, another class of faults they could easily have 'coded around' (handled) with a heuristic algorithm that simply added the characters' values (a crude hash), compared that sum against a lookup table of legal keywords and their hash values, and then tested whether such a keyword was already filled or remained empty.
They should already have the lookup table, and be tokenizing all keywords against it. Cutting the count in that table is the one slim real rationale that makes some logical sense, but instead of testing against it and returning a token saying 'this is a load1 product-kuid', they could simply return an 'ignore this line and treat it as a comment' code. We beat up Windwalkr pretty badly last July over eliminating comments (REM) in the config--where he agreed the pre-processing was a feasible approach.
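Here is roughly what I mean, as a sketch with placeholder keywords and token values of my own choosing. Note the character-sum 'hash' is order-independent, so transposed letters still match exactly; the caller would also check whether that keyword was already filled for the asset before trusting the second-pass guess.

```c
/* Sketch only; the keyword list and token values are placeholders. */
#include <string.h>

enum { TOK_UNKNOWN = -1, TOK_IGNORE = 0, TOK_USERNAME, TOK_KUID, TOK_MESH_TABLE };

struct keyword { const char *name; int token; };

/* Order-independent character sum: a crude hash that survives transposed
 * letters and stays close for a single wrong character. */
static unsigned char_sum(const char *s)
{
    unsigned h = 0;
    while (*s) h += (unsigned char)*s++;
    return h;
}

static const struct keyword KEYWORDS[] = {
    { "username",   TOK_USERNAME   },
    { "kuid",       TOK_KUID       },
    { "mesh-table", TOK_MESH_TABLE },
    { "obsolete-example-tag", TOK_IGNORE },  /* a retired tag: treat its line as a comment */
};

/* Return the token for a (possibly misspelled) keyword, TOK_IGNORE for a
 * retired tag, or TOK_UNKNOWN if nothing matches at all. */
static int lookup_token(const char *word)
{
    const size_t n = sizeof KEYWORDS / sizeof KEYWORDS[0];
    unsigned h = char_sum(word);

    for (size_t i = 0; i < n; ++i)              /* first pass: exact spelling     */
        if (strcmp(word, KEYWORDS[i].name) == 0)
            return KEYWORDS[i].token;

    for (size_t i = 0; i < n; ++i)              /* second pass: same character sum */
        if (h == char_sum(KEYWORDS[i].name))    /* (precompute these in real code) */
            return KEYWORDS[i].token;           /* probable misspelling/transposition */

    return TOK_UNKNOWN;
}
```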
Bottom line: in 2000, using the config as a loadable data source made sense given the state of computer hardware. Retaining that approach in 2005-2006 was far less justifiable, and given the HDDs and memory available in 2008 for TS2009, highly questionable. The decision to ride that horse to death has gone downhill from there, and a lot of the crap dumped on CCs like yourself--and on those who've hung up their creativity hats because of all the maintenance (did you sign a contract to be a perpetual slave, or figure that asset made in 2005 was going to be good for a decade or better?) and upgrades to an unnecessarily high base trainz-build spec--simply isn't justifiable on the grounds of data model evolution.

The engine specs changed a lot in the TCs, but aside from that, how many structures (containers, tags) are truly different in function? Oh, some kinds now support different scripting specs--scenery items can no longer use a software class restricted to mocrossings--but that is the kind of change where tightening up a specification is entirely understandable and appropriate. But thumbnail versus thumbnails? bogey versus bogeys? ... Pfffle! What else moved? Ah, smoke, now effects in the mesh table, iirc. I've only fixed a handful of bad tags in those, but nothing I recollect would prevent automated repositioning. They already know what they default to if undefined. They know what they've eliminated (epbrakes, name-XX, old category-XXXXX-nn, etc.). Converting each eliminated old form to a new one is, and will be, simple. Hell, I can do it!
THEY can probably, like PEV, even correct texture names with foreign alphabetic characters, because at the basic level all values in a computer are just numbers--codes given meaning by context, and the mesh IM has that sort of context. Indexed Mesh has, well, an index--a lookup associative list. Apply a similar hash when a texture isn't found, but compare it against the textures your alphabet restriction does allow; a couple of odd characters in the sum, plus a check against the count of the inventory of textures, will find such mappings as well. If and when one is detected, the mesh can be patched with the nearest legal character and the filename adjusted if necessary, with the original mesh retained--like the original config.txt--as a meshname.org.im file... one of the legal types they bundle into the chump files. AT THE LEAST, such an asset could generate an EXPLICIT error message pointing out the mesh had some French or Cyrillic character, and so forth.
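The same character-sum trick works for the texture inventory. A rough sketch, with everything in it my own assumption: when a referenced texture isn't on disk, compare its sum against each texture the folder actually contains and take the closest same-length name as the probable intended file.

```c
/* Sketch of the "nearest texture name" idea; nothing here is PEV's or
 * N3V's actual code. */
#include <string.h>

static unsigned char_sum(const char *s)
{
    unsigned h = 0;
    while (*s) h += (unsigned char)*s++;
    return h;
}

/* Given a texture name the mesh references but the folder lacks, return the
 * index of the on-disk texture with the same length and the closest
 * character sum -- a likely "same name, one odd character" case -- or -1.
 * The caller decides whether the difference is small enough to trust. */
static int nearest_texture(const char *wanted, const char **have, int count)
{
    int best = -1;
    unsigned best_diff = ~0u;
    int want_sum = (int)char_sum(wanted);

    for (int i = 0; i < count; ++i) {
        if (strlen(have[i]) != strlen(wanted))
            continue;                              /* only same-length candidates */
        int d = (int)char_sum(have[i]) - want_sum;
        unsigned diff = (unsigned)(d < 0 ? -d : d);
        if (diff < best_diff) { best_diff = diff; best = i; }
    }
    return best;
}
```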
Then there's the whole body of 'I can't find the texture' issues--I pointed out the absurdity of that above, and they have the code from the TRSs and TCs; they only need to cut it out and use it as a handler core. Trimming the whitespace trailing after a value inside quotes... I sat down last week and wrote out a parsing function core (a compressed sketch of it follows the list below) which:
- Retained and numbered a source line pointer
- detected the trial keyword and passed it to a function for validation as a keyword, handling a bad return value (a misspelling) with a second function call, or
- continued parsing the line (the return code indicates what to expect) for a begin-quote or digit, skipping ahead to the next non-blank character
- determined whether it was in-quote, in a number, or beginning a container (the '{' from C)
- assembled the value until not-inquote, end-of-line, or iswhite
- handled those, unless inside a quoted block like license or description, in which case it just finished incrementing to the line's end
- and, if still in-quote with a value, trimmed off the trailing whitespace and closed the quote; if numeric, called a function to parse it and return a numeric value union with a code defining the type, so both came back together
- which was put away in a line definition struct, whether a reference (kuid), a string value (texture.txt, file name, etc.) or a float, boolean, or integer.
- then loop until the end of line
- then loop for the next line
- then loop until the file bottom.
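Here is a compressed version of that pre-pass, written out as a sketch--my own names and struct layout, not anyone's shipping code. It classifies each line, captures the value, sets the structural flags, and keeps the raw line untouched for error reporting; kuid parsing, block quotes spanning lines, and the keyword lookup from the earlier sketch would hook in where the comments indicate.

```c
/* Per-line pre-processing sketch: one line_def per input line holding the
 * keyword token, the captured value, structural flags, and the unmolested
 * source line. Hypothetical layout for illustration only. */
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

enum value_kind { V_NONE, V_INT, V_FLOAT, V_STRING, V_KUID };

struct line_def {
    int   line_no;            /* source line pointer, for error messages        */
    int   key_token;          /* from the keyword lookup; an ignore code to skip */
    enum value_kind kind;     /* what the union below holds                      */
    union { long i; double f; char s[128]; } value;
    unsigned starts_container : 1;   /* saw '{' on this line                     */
    unsigned ends_container   : 1;   /* saw '}' on this line                     */
    unsigned in_quoted_block  : 1;   /* inside description/license (elided here) */
    char  raw[256];           /* the input line, unmolested                      */
};

/* Classify one line: capture the keyword and value, set structural flags.
 * Validation and nesting checks come in later passes. */
static void preprocess_line(const char *src, int line_no, struct line_def *out)
{
    memset(out, 0, sizeof *out);
    out->line_no = line_no;
    snprintf(out->raw, sizeof out->raw, "%s", src);

    const char *p = src;
    while (isspace((unsigned char)*p)) p++;            /* skip leading space     */

    if (*p == '{') { out->starts_container = 1; return; }
    if (*p == '}') { out->ends_container   = 1; return; }

    char key[64]; int n = 0;
    while (*p && !isspace((unsigned char)*p) && n < 63) key[n++] = *p++;
    key[n] = '\0';
    /* out->key_token = lookup_token(key); -- hook in the keyword sketch above */
    (void)key;

    while (isspace((unsigned char)*p)) p++;            /* advance to the value   */
    if (*p == '"') {                                   /* quoted string value    */
        p++;
        n = 0;
        while (*p && *p != '"' && n < 127) out->value.s[n++] = *p++;
        while (n > 0 && isspace((unsigned char)out->value.s[n - 1])) n--;  /* trim */
        out->value.s[n] = '\0';
        out->kind = V_STRING;
    } else if (isdigit((unsigned char)*p) || *p == '-') {   /* numeric value     */
        if (strchr(p, '.')) { out->value.f = atof(p); out->kind = V_FLOAT; }
        else                { out->value.i = atol(p); out->kind = V_INT;   }
    } else if (*p == '{') {
        out->starts_container = 1;                     /* "tag {" on one line    */
    }
}

int main(void)
{
    char buf[256];
    struct line_def def;
    int line_no = 0;
    while (fgets(buf, sizeof buf, stdin)) {            /* loop to file bottom    */
        buf[strcspn(buf, "\r\n")] = '\0';
        preprocess_line(buf, ++line_no, &def);
        /* push a copy of def onto the line list / container stack here */
    }
    return 0;
}
```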
All that fits in just about the same number of coded lines as the numbered lines above, and since it's preprocessing, it isn't doing much beyond value capture, pairing with keyword identifiers, and setting processing flags; but as a first pass it will handle comments, the description, license, author, etc. fields, and filenames (which could then be tested as a next step). It also sets flags (line is part of a container, part of a subcontainer, starts a container, starts a subcontainer, starts a quoted block, ends one... etc.) in the struct holding that pre-processed (prepared) line, logs the line number, and packs the line's raw data while retaining the input line unmolested. Gee, not even a screenful of code, and the accumulated lines already know which lines to ignore in further processing, which have a container, which have external references (kuids), and which point to another file in the folder.
In a good pre-process, nesting would be checked as well, so each line struct and/or raw line would have a pointer to it pushed on a stack, so they can be popped off in reverse order to check parsing from bottom to top. AT THE LEAST, the back read versus front read will pinpoint the lines with troublesome mismatched ends and abort the generation of false fault lists before additional evaluation and validation. That eliminates the long lists of false errors from typos and unpaired container ends, speeding the by-hand fix by generating one proper error. Bottom line, they try to do too much in one pass, and cost us time by not taking the time at this input and evaluation stage to eliminate what they could fairly simply. A screenful of code, and you get an intermediate file--compacted and ready to process with the confidence of knowing it's stripped, balanced, and that whatever can be ignored has been.
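The nesting check itself is nearly trivial--push on '{', pop on '}', and anything left over (or a pop with nothing under it) names the exact offending line. A sketch (my own simplification: it ignores braces inside quoted description text, which is exactly what the quoted-block flag from the pre-pass would exclude):

```c
/* Standalone brace-balance check over a config read from stdin. */
#include <stdio.h>

#define MAX_DEPTH 64

int main(void)
{
    int stack[MAX_DEPTH], depth = 0, line_no = 0, c;
    while ((c = getchar()) != EOF) {
        if (c == '\n') line_no++;
        else if (c == '{') {
            if (depth < MAX_DEPTH) stack[depth] = line_no + 1;  /* remember where it opened */
            depth++;
        } else if (c == '}') {
            if (depth == 0)
                printf("line %d: '}' with no matching '{'\n", line_no + 1);
            else
                depth--;
        }
    }
    while (depth-- > 0)                     /* anything still open never closed */
        if (depth < MAX_DEPTH)
            printf("line %d: '{' never closed\n", stack[depth]);
    return 0;
}
```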
Let's see: hand-edit someone else's typos and hang onto version-relevant tags from an earlier era--retaining a historic relation and preserving the ability to make a few minor regressive changes--or have the user community spend their hobby time chasing down trivial errors that could be handled far more efficiently with well-designed code...
Hmmmmm. Tough choice here. Incompetence, disrespect for the users' time, and unnecessary, needless, or incorrect fault messages,
or competence, consideration, and efficient effectiveness... with far fewer curable assets reporting as faulty from the DLS. Which do you prefer? Personally, I think I'm for proper translation of older assets before requiring human intervention. // Frank