Looking at yesterday's test results is depressing: http://test.winehq.org/data/200710241000/
Just looking at the pretty colors may not make this very obvious, but the state of the tests is APPALLING.
          | Successes | Failures | Failure rate | Not Run
WinXP-1   |       260 |       53 |          17% |       0
WinXP-2   |       252 |       62 |          20% |       0
Win2003-1 |       264 |       49 |          16% |       0
Win2003-2 |       241 |       72 |          23% |       0
Win98     |       113 |      115 |          50% |      45
So depending on the test run and the platform, the failure rate is between 16% and over 50%: really 53% on Win98, since the gdi32 tests should run there too.
This is important because if you try to fix Wine to make these tests pass, you may actually end up breaking Wine. So until these tests are fixed and pass on Windows, they are of no use to us.
So yeah, we don't have an http://test.winehq.org/ home page, and http://test.winehq.org/data/ should be made nicer too, but even so I'm hoping a few motivated developers will find their way to the test results and start a push to make things better.
Btw, shouldn't it be a Wine 1.0 goal to have our conformance test suite pass cleanly on Windows?
Just looking at the pretty colors may not make this very obvious, but the state of the tests is APPALLING.
Agreed. I wonder how much of it has to do with not noticing that the tests have failed?
I may just be transforming the problem from an easy one (we shouldn't be lazy about checking the test results) to a hard one, but: what about automatically doing a regression test to find the patch that broke the test, and logging a bug for it?
I suspect the biggest problem is keeping the winetest executable up to date on the systems. If the test system can't compile the tests, it can't easily perform a regression test. What's the biggest obstacle to that?
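Something along these lines could drive it (a rough Python sketch; the per-test "make foo.ok" target and the exact layout are from memory, so treat them as assumptions):

import subprocess, sys

def find_breaking_commit(good, bad, dll, test):
    # Mark the known-bad and known-good endpoints.
    subprocess.check_call(["git", "bisect", "start", bad, good])
    # git re-runs this command on each candidate commit; exit
    # status 0 marks the commit good, non-zero marks it bad.
    cmd = "make -s && make -C dlls/%s/tests %s.ok" % (dll, test)
    subprocess.check_call(["git", "bisect", "run", "sh", "-c", cmd])
    # git has now named the first bad commit; a robot could parse
    # that output and file the bug automatically.
    subprocess.check_call(["git", "bisect", "reset"])

if __name__ == "__main__":
    # e.g.: bisect.py wine-0.9.46 HEAD gdi32 bitmap
    find_breaking_commit(*sys.argv[1:5])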
--Juan
Juan Lang wrote:
Just looking at the pretty colors may not make this very obvious, but the state of the tests is APPALLING.
Agreed. I wonder how much of it has to do with not noticing that the tests have failed?
I may just be transforming the problem from an easy one (we shouldn't be lazy about checking the test results) to a hard one, but: what about automatically doing a regression test to find the patch that broke the test, and logging a bug for it?
Amen!!! I have been meaning to do this, but I have not been able to find the time.
I suspect the biggest problem is keeping the winetest executable up to date on the systems. If the test system can't compile the tests, it can't easily perform a regression test. What's the biggest obstacle to that?
One could do what the Bazaar developers do: they have a mailing list robot that snatches patches from their dev list and commits them.
Our robot could build them (on a Linux system) and run the resulting winetest.exe on a Windows virtual machine.
Then the patch could be black-flagged _before_ it was committed by Alexandre.
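Schematically, the robot's core loop might look like this (a hypothetical Python sketch; the 'crosstest' target and the VM glue are assumptions):

import subprocess

def run_in_vm(exe):
    # Stand-in for the site-specific VM glue (e.g. vmrun); should
    # return the exit status of winetest inside the guest.
    raise NotImplementedError

def test_patch(patch_file):
    # Start from a pristine tree, then try the patch.
    subprocess.check_call(["git", "checkout", "-f", "origin/master"])
    if subprocess.call(["git", "apply", patch_file]) != 0:
        return "does not apply"
    # Cross-compile the tests with mingw (assumes a tree already
    # configured for cross-compilation).
    if subprocess.call(["make", "-s", "crosstest"]) != 0:
        return "build failure"
    if run_in_vm("programs/winetest/winetest.exe") != 0:
        return "tests regressed"
    return "ok"

Anything other than "ok" would get flagged back to the list before the patch reached Alexandre.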
regards, Jakob
On Thu, 2007-10-25 at 09:38 -0700, Juan Lang wrote:
I suspect the biggest problem is keeping the winetest executable up to date on the systems. If the test system can't compile the tests, it can't easily perform a regression test. What's the biggest obstacle to that?
There's a lot of machinery needed on a box to rebuild wine, and Windows boxes typically have no development tools whatsoever.
There's a lot of machinery needed on a box to rebuild wine, and Windows boxes typically have no development tools whatsoever.
Okay, but the toolchain to build winetest is relatively small, isn't it? Could we include that in the Windows version of the tests in order to speed up our response to tests that fail on Windows?
--Juan
"Juan Lang" juan.lang@gmail.com writes:
There's a lot of machinery needed on a box to rebuild wine, and Windows boxes typically have no development tools whatsoever.
Okay, but the toolchain to build winetest is relatively small, isn't it? Could we include that in the Windows version of the tests in order to speed up our response to tests that fail on Windows?
There's no real reason to speed this up; Windows is not changing, unlike Wine, so we don't have to catch regressions. It's fine to come back and fix tests after the fact.
If we require tests to pass on all Windows versions before getting committed it will drastically reduce the number of tests accepted, with little benefit. In most cases tests fail on some Windows boxes because they are too strict in the behavior they expect, and that's not a problem for us.
The only cases that we should really worry about are tests that fail the same way on all Windows versions, because it shows that the test is expecting the wrong thing.
Alexandre Julliard wrote:
If we require tests to pass on all Windows versions before getting committed it will drastically reduce the number of tests accepted, with little benefit. In most cases tests fail on some Windows boxes because they are too strict in the behavior they expect, and that's not a problem for us.
Except that the tests clutter up the reports. We should have at least one dedicated, declared-sane Windows installation that we regard as most important. When you _start_ expecting tests to fail is when you _stop_ paying attention to them.
(That Codeweavers do not have such an installation yet is beyond me. Or if you do, please make it automatically submit its findings to test.winehq.org!)
The only cases that we should really worry about are tests that fail the same way on all Windows versions, because it shows that the test is expecting the wrong thing.
Cleaning up these tests to not expect the wrong thing should lead to a deeper understanding of desired behaviour of Wine.
regards, Jakob
Jakob Eriksson wrote:
Alexandre Julliard wrote:
If we require tests to pass on all Windows versions before getting committed it will drastically reduce the number of tests accepted, with little benefit. In most cases tests fail on some Windows boxes because they are too strict in the behavior they expect, and that's not a problem for us.
Except that the tests clutter up the reports. We should have at least one dedicated, declared-sane Windows installation that we regard as most important. When you _start_ expecting tests to fail is when you _stop_ paying attention to them.
(That Codeweavers do not have such an installation yet is beyond me. Or if you do, please make it automatically submit its findings to test.winehq.org!)
We do. I've got a machine that regularly runs the test on Windows 2003 on real hardware: http://test.winehq.org/data/200710241000/2003_rshearman/report
However, the tests are run by a service rather than manually by me to reduce the effort needed. This results in some tests failing when they perhaps shouldn't.
This also doesn't help the Direct3D developers since D3D is disabled by default on Windows 2003, so we also need a real Windows XP box to run the tests regularly.
Robert Shearman wrote:
We do. I've got a machine that regularly runs the test on Windows 2003 on real hardware: http://test.winehq.org/data/200710241000/2003_rshearman/report
That's excellent!
However, the tests are run by a service rather than manually by me to reduce the effort needed. This results in some tests failing when they perhaps shouldn't.
I have found that scripting a wget that runs periodically works well, with winetest in "Desktop mode". Maybe something for that server?
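For instance (a Python stand-in for the wget script; the download URL is a placeholder for wherever the winetest builds are published):

import subprocess, time, urllib.request

URL = "http://example.com/latest/winetest.exe"   # placeholder

while True:
    # Grab the latest build and run it; winetest submits its own
    # results to test.winehq.org when the run completes.
    urllib.request.urlretrieve(URL, "winetest.exe")
    subprocess.call(["winetest.exe"])
    time.sleep(24 * 60 * 60)   # once a day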
This also doesn't help the Direct3D developers since D3D is disabled by default on Windows 2003, so we also need a real Windows XP box to run the tests regularly.
True...
regards, Jakob
Robert Shearman wrote:
Jakob Eriksson wrote:
[...]
(That Codeweavers do not have such an installation yet is beyond me. Or if you do, please make it automatically submit its findings to test.winehq.org!)
We do. I've got a machine that regularly runs the test on Windows 2003 on real hardware: http://test.winehq.org/data/200710241000/2003_rshearman/report
And I have recently put together a script for running winetest in VMware virtual machines unattended (see my other post). So going forward I will be running it on Windows 98, Windows XP and Windows 2003 nightly.
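The general shape of it is roughly this (a Python sketch around VMware's vmrun tool, not the actual script; the .vmx path, snapshot name and guest login are placeholders):

import subprocess

VMX = "/vm/winxp/winxp.vmx"                    # placeholder
GUEST = ["-gu", "tester", "-gp", "secret"]     # guest login, placeholder

def vmrun(args):
    subprocess.check_call(["vmrun"] + args)

# Revert to a clean snapshot so every run starts from a known state.
vmrun(["revertToSnapshot", VMX, "clean"])
vmrun(["start", VMX])
# Push the latest winetest build into the guest and run it; winetest
# submits the results to test.winehq.org when it finishes.
vmrun(GUEST + ["copyFileFromHostToGuest", VMX,
               "winetest.exe", "C:\\winetest.exe"])
vmrun(GUEST + ["runProgramInGuest", VMX, "C:\\winetest.exe"])
vmrun(["stop", VMX])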
Francois Gouget wrote:
And I have recently put together a script for running winetest in VMware virtual machines unattended (see my other post). So going forward I will be running it on Windows 98, Windows XP and Windows 2003 nightly.
This is soooo good. test.winehq.org will become several times more useful than before: a consistent track record from the same installations of Windows.
regards, Jakob
On 27/10/2007, Jakob Eriksson <jakob@vmlinux.org> wrote:
Francois Gouget wrote:
And I have recently put together a script for running winetest in VMware virtual machines unattended (see my other post). So going forward I will be running it on Windows 98, Windows XP and Windows 2003 nightly.
This is soooo good. test.winehq.org will become several times more useful than before: a consistent track record from the same installations of Windows.
I agree.
It would be even better if the tests were also run on real machines, as that would catch which test failures are VM related (such as the Direct3D tests).
- Reece
On Sun, 28 Oct 2007, Reece Dunn wrote: [...]
It would be even better if the tests were also run on real machines, as that would catch which test failures are VM related (such as the Direct3D tests).
Sure. However I don't have real Windows machines, so I'll leave this as an exercise for someone else. The tricky part is scripting the download, the signature checking, and getting it all to run automatically at night (because if it's not 100% automated I don't think we'll have regular results).

Maybe the way to go is with the Windows scheduler plus a WSH script. Or winetest could be modified to do most of it: with wininet/winhttp/urlmon, managing the download part should be quite feasible, while crypt32 might provide for the signature checking part. Then it's just a matter for winetest to remain idle (with a systray icon?) until the specified time to run the tests...
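The scheduler-plus-script variant might look something like this (sketched in Python rather than WSH; the URL is a placeholder and the signature check is only a stub):

import subprocess, urllib.request

URL = "http://example.com/nightly/winetest.exe"   # placeholder

def verify_signature(path):
    # Stub: a real script must verify the signature (say, an
    # Authenticode check) before running a downloaded binary;
    # fail closed until that is wired in.
    return False

urllib.request.urlretrieve(URL, "winetest.exe")
if verify_signature("winetest.exe"):
    subprocess.check_call(["winetest.exe"])

Registering that to run nightly with the Windows scheduler is then a one-time step; teaching winetest to do the download and verification itself would just fold the same steps into the executable.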
Actually, I'll mention that virtual machines do have an advantage over real Windows machines: after the tests are run, they are reset back to a known state, even if the tests did not clean things up quite right.