There are lots of bugs filed for little (or big) problems in wine's cmd that involve parsing the command language:
21047 cmd does not handle for %%a in ('command')
21046 cmd does not handle all operators in 'if' command
20161 cmd can't handle echo commands containing quotes and redirection
19784 cmd doesn't handle ( ) scoping; breaks firefox build
18712 cmd: "if defined ... " command crashes.
18407 cmd.exe: set command seems broken
18346 cmd does not support the "^" escape character
18057 cmd.exe: mishandled quoted built-in commands with parameters (programs/cmd/wcmdmain.c: has_space==1 && opt_s==0)
15359 cmd's "for" command doesn't handle /F. Breaks msysgit, firefox build.
The existing parser in cmd is pretty weak. I wonder if it might not be time to write a new parser -- either using some parser generator tool, or more likely, a plain old recursive descent parser -- that can handle the language better.
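To make the "plain old recursive descent parser" idea concrete, here is a minimal sketch in Python. It is not Wine code and covers only a tiny, assumed subset of the language: words, double quotes, the "^" escape, ( ) grouping, and the operators &, &&, || and |. The grammar, the function names and the tuple-based tree are all made up for illustration. Real cmd gives & and &&/|| different precedence and treats a trailing "^" as line continuation, both of which this sketch ignores.

```python
OPERATORS = ("&&", "||", "&", "|", "(", ")")  # longest operators first


def tokenize(line):
    """Split a line into ("word", text) and ("op", text) tokens."""
    tokens, word, in_quotes, i = [], [], False, 0

    def flush():
        if word:
            tokens.append(("word", "".join(word)))
            word.clear()

    while i < len(line):
        ch = line[i]
        if ch == '"':                      # quotes toggle literal mode
            in_quotes = not in_quotes
            word.append(ch)
            i += 1
        elif in_quotes:                    # operators are literal inside quotes
            word.append(ch)
            i += 1
        elif ch == "^" and i + 1 < len(line):
            word.append(line[i + 1])       # "^" escapes the next character
            i += 2
        elif ch.isspace():
            flush()
            i += 1
        else:
            for op in OPERATORS:
                if line.startswith(op, i):
                    flush()
                    tokens.append(("op", op))
                    i += len(op)
                    break
            else:
                word.append(ch)
                i += 1
    flush()
    return tokens


class Parser:
    """Assumed grammar:
    sequence := pipeline (("&" | "&&" | "||") pipeline)*
    pipeline := primary ("|" primary)*
    primary  := "(" sequence ")" | WORD+
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next_is_op(self, *ops):
        tok = self.peek()
        return tok is not None and tok[0] == "op" and tok[1] in ops

    def parse_sequence(self):
        node = self.parse_pipeline()
        while self.next_is_op("&", "&&", "||"):
            op = self.peek()[1]
            self.pos += 1
            node = (op, node, self.parse_pipeline())
        return node

    def parse_pipeline(self):
        node = self.parse_primary()
        while self.next_is_op("|"):
            self.pos += 1
            node = ("|", node, self.parse_primary())
        return node

    def parse_primary(self):
        if self.next_is_op("("):
            self.pos += 1
            node = ("group", self.parse_sequence())
            if not self.next_is_op(")"):
                raise SyntaxError("expected ')'")
            self.pos += 1
            return node
        words = []
        while self.peek() is not None and self.peek()[0] == "word":
            words.append(self.peek()[1])
            self.pos += 1
        if not words:
            raise SyntaxError("expected a command")
        return ("cmd", words)


def parse(line):
    parser = Parser(tokenize(line))
    tree = parser.parse_sequence()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token %r" % (parser.peek(),))
    return tree
```

Each grammar rule maps to one method, which is the main appeal of the approach: handling ( ) scoping is just one more branch in parse_primary rather than a special case spread through the code.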
Also, some of the builtins need to be split out into separate executables, see e.g. 18059 - Unity Indie Trial exits because attrib.exe is missing (which might relax the requirements on the parser inside cmd slightly)
On Thu, Dec 17, 2009 at 1:57 AM, Dan Kegel dank@kegel.com wrote:
There are lots of bugs filed for little (or big) problems in wine's cmd that involve parsing the command language: ... The existing parser in cmd is pretty weak. I wonder if it might not be time to write a new parser -- either using some parser generator tool, or more likely, a plain old recursive descent parser -- that can handle the language better.
I wonder if http://www.antlr.org/ would be overkill for generating the parser. It does support generating C... might be a fun thing to try, anyway. ANTLR does seem to be the most popular parser generator these days.
2009/12/17 Dan Kegel dank@kegel.com:
I wonder if http://www.antlr.org/ would be overkill for generating the parser. It does support generating C... might be a fun thing to try, anyway. ANTLR does seem to be the most popular parser generator these days.
That needs Java.
On Thu, Dec 17, 2009 at 2:43 AM, Henri Verbeet hverbeet@gmail.com wrote:
2009/12/17 Dan Kegel dank@kegel.com:
I wonder if http://www.antlr.org/ would be overkill for generating the parser. It does support generating C... might be a fun thing to try, anyway. ANTLR does seem to be the most popular parser generator these days.
That needs Java.
But only when generating the parser. It can output a pure C parser.
If it's not ok to require Java to build (and I could imagine that), we could check in ANTLR's output. Then only people hacking on cmd would need java installed.
2009/12/17 Dan Kegel dank@kegel.com:
On Thu, Dec 17, 2009 at 2:43 AM, Henri Verbeet hverbeet@gmail.com wrote:
That needs Java.
But only when generating the parser. It can output a pure C parser.
If it's not ok to require Java to build (and I could imagine that), we could check in ANTLR's output. Then only people hacking on cmd would need java installed.
Well, at least Alexandre as well, and I could imagine distributions wanting to generate those files themselves as well.
IIRC ANTLR also includes its own IDE/editor, although it's probably not required to use that. It seems to me like it would be less trouble to either use bison/yacc or write something by hand, like most of the other parsers in Wine. The more interesting problem is of course finding someone to work on it in the first place.
On Thu, Dec 17, 2009 at 3:05 AM, Henri Verbeet hverbeet@gmail.com wrote:
If it's not ok to require Java to build (and I could imagine that), we could check in ANTLR's output. Then only people hacking on cmd would need java installed.
Well, at least Alexandre as well, and I could imagine distributions wanting to generate those files themselves as well.
Sure. And they can do 'apt-get install antlr3' or the equivalent pretty easily.
IIRC ANTLR also includes its own IDE/editor, although it's probably not required to use that.
You certainly don't need to use any ide.
It seems to me like it would be less trouble to either use bison/yacc or write something by hand, like most of the other parsers in Wine. The more interesting problem is of course finding someone to work on it in the first place.
The two problems are not unrelated. It's possible that finding somebody willing to use antlr would be easier than finding somebody willing to use yacc or write it from scratch.
I'm also biased towards hand-written parsers, but the cmd language is big and hairy; it's tempting to try prototyping the parser with antlr before diving in to write it by hand. And who knows, maybe the antlr version would suffice.
I'm trying to build "hello, antlr" now as a reality check. ... ok. Here's what I had to do for a java example on jaunty:
Copy and paste the grammar and test program from http://www.antlr.org/wiki/display/ANTLR3/Expression+evaluator into Expr.g and Test.java
  sudo apt-get install antlr3
  antlr3 Expr.g
  javac -cp .:/usr/share/java/antlr3-3.0.1+dfsg.jar Test.java ExprLexer.java ExprParser.java
It originally gave me the error
  ExprParser.java:163: warning: [unchecked] unchecked call to put(K,V) as a member of the raw type java.util.HashMap
  memory.put(ID2.getText(), new Integer(expr3));
  ^
but that just meant the example is for old java; I had to update the line
  HashMap memory = new HashMap();
to
  HashMap<String,Integer> memory = new HashMap<String,Integer>();
Then all went tickety-boo, no gui involved, and the demo worked.
Now to try the same thing in C. But sleep first.
2009/12/17 Dan Kegel dank@kegel.com:
The two problems are not unrelated. It's possible that finding somebody willing to use antlr would be easier than finding somebody willing to use yacc or write it from scratch.
For a potential new contributor possibly, but I doubt that's true for most existing Wine contributors. My personal experience is that most people I know that use or used ANTLR are Java developers rather than C developers though. I did use ANTLR v2.x myself in the past (I wrote some Java too, hope it doesn't show too much ;)), although more in a maintenance kind of way. Seemed to work well enough, although I don't remember it having compelling features over bison/yacc.
On Thu, Dec 17, 2009 at 4:03 AM, Henri Verbeet hverbeet@gmail.com wrote:
2009/12/17 Dan Kegel dank@kegel.com:
The two problems are not unrelated. It's possible that finding somebody willing to use antlr would be easier than finding somebody willing to use yacc or write it from scratch.
For a potential new contributor possibly, but I doubt that's true for most existing Wine contributors.
I'm thinking of college students :-)
My personal experience is that most people I know that use or used ANTLR are Java developers rather than C developers though. I did use ANTLR v2.x myself in the past (I wrote some Java too, hope it doesn't show too much ;)), although more in a maintenance kind of way. Seemed to work well enough, although I don't remember it having compelling features over bison/yacc.
Multilanguage support seems pretty compelling (though it's kind of limited, since one usually intersperses target language code in amongst the grammar...) - Dan
Dan Kegel a écrit :
On Thu, Dec 17, 2009 at 4:03 AM, Henri Verbeet hverbeet@gmail.com wrote:
2009/12/17 Dan Kegel dank@kegel.com:
The two problems are not unrelated. It's possible that finding somebody willing to use antlr would be easier than finding somebody willing to use yacc or write it from scratch.
For a potential new contributor possibly, but I doubt that's true for most existing Wine contributors.
I'm thinking of college students :-)
My personal experience is that most people I know that use or used ANTLR are Java developers rather than C developers though. I did use ANTLR v2.x myself in the past (I wrote some Java too, hope it doesn't show too much ;)), although more in a maintenance kind of way. Seemed to work well enough, although I don't remember it having compelling features over bison/yacc.
Multilanguage support seems pretty compelling (though it's kind of limited, since one usually intersperses target language code in amongst the grammar...)
- Dan
We don't need to support several grammar compilers in Wine; yacc/bison is more than sufficient. But I agree that using a real grammar to rewrite cmd would be a real gain. The current code is unmaintainable as it is.
A+
On Thu, Dec 17, 2009 at 12:06 PM, Eric Pouech eric.pouech@orange.fr wrote:
We don't need to support several grammar compilers in Wine; yacc/bison is more than sufficient. But I agree that using a real grammar to rewrite cmd would be a real gain. The current code is unmaintainable as it is.
For what it's worth, antlr generates recursive descent parsers, so the resulting code is a lot easier to read than the code generated by yacc.
antlr's c runtime seems huge, though, so I rather doubt we would use it. - Dan
On Thu, Dec 17, 2009 at 2:07 PM, Dan Kegel dank@kegel.com wrote:
On Thu, Dec 17, 2009 at 12:06 PM, Eric Pouech eric.pouech@orange.fr wrote:
We don't need to support several grammar compilers in Wine; yacc/bison is more than sufficient.
yow, we already have eight parsers written in bison in wine.
bison it is, then.
On Thu, Dec 17, 2009 at 8:32 PM, Dan Kegel dank@kegel.com wrote:
bison it is, then.
I'm thinking of having some UCLA students do this, and it occurred to me that using "ply" might be a good way to start. ply is a nice lex/yacc implementation written in Python that sounds good for prototyping. Once the grammar is far enough along, it'd be pretty easy to switch to C.
Maybe I'll noodle around with ply and see if I can implement a subset of cmd's language with it. Here's a tiny toy start: http://kegel.com/wine/cmd.py
ply home page: http://www.dabeaz.com/ply/ example of use of ply in compiler course: http://ecee.colorado.edu/~siek/ecen4553/fall08/hw2.pdf
On Fri, Dec 18, 2009 at 5:00 PM, Dan Kegel dank@kegel.com wrote:
Maybe I'll noodle around with ply and see if I can implement a subset of cmd's language with it. Here's a tiny toy start: http://kegel.com/wine/cmd.py
ply home page: http://www.dabeaz.com/ply/ example of use of ply in compiler course: http://ecee.colorado.edu/~siek/ecen4553/fall08/hw2.pdf
Oh, and http://ss64.com/nt/syntax.html seems like a nice summary of the language. - Dan
On Fri, Dec 18, 2009 at 8:01 PM, Dan Kegel dank@kegel.com wrote:
Oh, and http://ss64.com/nt/syntax.html seems like a nice summary of the language.
This suggestion will most likely get null'd as it has before, but since there are still patches accepted by ReactOS developers from time to time, it would seem to make more sense to me to just prepackage a winelib build or binary build of their cmd.exe somewhere. It's mostly feature complete and is derived from FreeDOS command.com, and a large chunk of the development work on it was done by former CodeWeaver Eric Kohl.
We (reactos developers) toyed with the idea of spawning some of the subprojects off in the past, so perhaps we could create a new sourceforge project, something along the lines of FreeCMD and move the project and revision history there. Linking to it as a separate project package is not that much of a stretch from how we do things with Gecko.
Having a comprehensive test suite written for the scripting language makes sense, but the duplicated effort to re-implement the whole app never has, given that there are already shared tools in the winehq tree such as regedit and taskmgr, unless you are ready to propose that they be ripped out.
On Fri, Dec 18, 2009 at 6:07 PM, Steven Edwards winehacker@gmail.com wrote:
This suggestion will most likely get null'd as it has before, but since there are still patches accepted by ReactOS developers from time to time, it would seem to make more sense to me to just prepackage a winelib build or binary build of their cmd.exe somewhere.
No, I'd rather Wine be completely source-based. Binary blobs don't seem right, and as you know, we can't share ReactOS source code.
It's mostly feature complete and is derived from FreeDOS command.com and a large chunk of the development work on it was done by former CodeWeaver Eric Kohl.
For what it's worth, I've looked at the FreeDOS command.com, and its parser doesn't seem much better than what wine's cmd already has. http://freedos.svn.sourceforge.net/viewvc/freedos/freecom/trunk/shell/ I'm looking for something a little more solid.
Having a comprehensive test suite written for the scripting language makes sense
Yeah. - Dan
We (reactos developers) toyed with the idea of spawning some of the subprojects off in the past, so perhaps we could create a new sourceforge project, something along the lines of FreeCMD and move the project and revision history there.
Has relicensing such a spin-off also been discussed? (Don't want to impose anything, I'm just curious here.)
Regards,
Wolfram
On Sat, Dec 19, 2009 at 12:33 PM, Wolfram Sang wolfram@the-dreams.de wrote:
Has relicensing such a spin-off also been discussed? (Don't want to impose anything, I'm just curious here.)
Not specifically the case of cmd but we've been discussing it regarding large parts of the project as a whole. A dual or tri-license solution much like what the Mozilla project does is preferable but nothing is set in stone yet. If there is a legitimate need for any part to be relicensed most of the developers are accommodating. Email me privately if you have a specific request.
Thanks
Dan Kegel a écrit :
On Thu, Dec 17, 2009 at 8:32 PM, Dan Kegel dank@kegel.com wrote:
bison it is, then.
I'm thinking of having some UCLA students do this, and it occurred to me that using "ply" might be a good way to start. ply is a nice lex/yacc implementation written in Python that sounds good for prototyping. Once the grammar is far enough along, it'd be pretty easy to switch to C.
Maybe I'll noodle around with ply and see if I can implement a subset of cmd's language with it. Here's a tiny toy start: http://kegel.com/wine/cmd.py
ply home page: http://www.dabeaz.com/ply/ example of use of ply in compiler course: http://ecee.colorado.edu/~siek/ecen4553/fall08/hw2.pdf
beware that the integration of the parser in the current source code (ie splitting in several small patches) can be tedious so I'd rather see:
- implement the parser to be usable command by command (ie from first word, to trigger either the old parser or the yacc one, or something like that)
- when all commands are done, then switch to the global parser mode
I don't think writing the grammar itself will be very complicated, but wiring all the code to the grammar will be more tedious
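The command-by-command migration could be sketched roughly like this, in Python for brevity. Everything here is hypothetical: MIGRATED, old_parser, new_parser and dispatch are illustrative names, not anything in programs/cmd, and the two parser functions are stand-ins that only report which path was taken.

```python
# Hypothetical set of commands the new grammar already handles.
MIGRATED = {"echo", "if"}


def old_parser(line):
    """Stand-in for the existing hand-rolled parsing code."""
    return ("old", line)


def new_parser(line):
    """Stand-in for the grammar-based (e.g. bison-generated) parser."""
    return ("new", line)


def first_word(line):
    """Return the lower-cased first word, ignoring a leading '@'."""
    parts = line.lstrip().lstrip("@").split(None, 1)
    return parts[0].lower() if parts else ""


def dispatch(line, migrated=MIGRATED):
    """Route a line to the new parser only if its command is migrated."""
    parser = new_parser if first_word(line) in migrated else old_parser
    return parser(line)
```

With this shape, each small patch adds one command to the migrated set, and the final switch to the global parser mode amounts to deleting the dispatcher. Note the sketch splits on whitespace only, so forms like "echo.text" would still go to the old path.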
A+
On Fri, Dec 18, 2009 at 11:47 PM, Eric Pouech eric.pouech@orange.fr wrote:
beware that the integration of the parser in the current source code (ie splitting in several small patches) can be tedious so I'd rather see:
- implement the parser to be usable command by command (ie from first word,
to trigger either the old parser or the yacc one, or something like that)
- when all commands are done, then switch to the global parser mode
I was thinking of putting the new implementation in programs/cmd2 or something, but I like your idea better.
I don't think writing the grammar itself will be very complicated, but wiring all the code to the grammar will be more tedious
Yeah. - Dan
I'm getting ready to propose this as a project for students at UCLA. See http://kegel.com/wine/sweng/2010/ Comments welcome (especially from anyone who knows our current cmd implementation).
Dan Kegel a écrit :
I'm getting ready to propose this as a project for students at UCLA. See http://kegel.com/wine/sweng/2010/ Comments welcome (especially from anyone who knows our current cmd implementation).
you could add a couple of references to cmd interpretations:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-...
http://ss64.com/nt/ (especially the syntax part)
http://technet.microsoft.com/en-us/library/cc723564.aspx#mainSection
http://en.wikipedia.org/wiki/Batch_file
in task#1,
- the .bat (and others) files should be embedded in the .c files and generated on the fly for better inclusion
- this has to be done in the most generic way (there's no test env for programs right now)
in task#3,
- s/wineqa/winehq/
- if we go for a yacc/bison grammar, it's definitely a good idea to implement it command by command (as a way to split in small patches)
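As a rough illustration of the "embedded and generated on the fly" idea from task#1, here is a sketch in Python; in Wine itself the scripts would live in the .c test sources. The names EMBEDDED_TESTS, run_embedded and check_all are invented for this example, and `runner` is an assumed callable that takes the path of a .bat file and returns its output (in practice it would launch cmd on the file; that part is deliberately left out).

```python
import os
import tempfile

# Hypothetical test cases: (batch script text, expected output).
EMBEDDED_TESTS = [
    ("@echo off\r\necho hello\r\n", "hello\r\n"),
]


def run_embedded(script, runner):
    """Write `script` to a temporary .bat file, run it, then clean up."""
    fd, path = tempfile.mkstemp(suffix=".bat")
    try:
        # newline="" keeps the embedded CRLF line endings untouched.
        with os.fdopen(fd, "w", newline="") as f:
            f.write(script)
        return runner(path)
    finally:
        os.remove(path)


def check_all(tests, runner):
    """Return the scripts whose output differs from the expectation."""
    return [script for script, expected in tests
            if run_embedded(script, runner) != expected]
```

Keeping the runner as a parameter is one way to stay generic: the same table of scripts could be run against Wine's cmd and against a real Windows cmd.exe to confirm the expectations.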
A+
On Wed, Dec 23, 2009 at 12:22 AM, Eric Pouech eric.pouech@orange.fr wrote:
you could add a couple of references to cmd interpretations:
in task#1,
- the .bat (and others) files should be embedded in the .c files and
generated on the fly for better inclusion
- this has to be done in the most generic way (there's no test env for
programs right now)
in task#3,
- s/wineqa/winehq/
- if we go for a yacc/bison grammar, it's definitely a good idea to
implement it command by command (as a way to split in small patches)
Done, thanks.