P3D v5 Water detection observations and UE's

Clutch Cargo · 18 Sep 2023

"Always worth a try"

I agree. So what would you like/need? I can send you an image and its matching water, I believe in polygon and/or tiff format (it's been a while, so I will have to look). Each image ranges from 400MB (1m) and 1GB (60cm) to 4GB (30cm). Geographic areas vary, mostly Southern California, but I do have areas in Washington, Colorado, Arizona and New Mexico. I am assuming the more variety the better.

Does the resolution make a difference, or is the matching water what matters more? Do you even need the image? If you need images as well, would smaller TFE sample images work just as well (thinking of sizes/quantity to upload and work with)?

I guess my last question... would uploading a single image & water suffice for an initial test? Would you be able to run a test on a single image and see if that works? Just don't want to upload 100's of GB if it does not work.

arno · 19 Sep 2023

Hi,

I would need both the image and the water mask. To train the algorithm you need both of them.

We can try with one or two images. I'll probably have to split them into parts as the training is done on smaller samples, but that's no issue.

More variation in geographical area is indeed better. Resolution does not matter too much, I think I used 60 cm before to train, but 1 meter should also do.

Clutch Cargo · 19 Sep 2023

I have uploaded a first round of sample data. Check your email for the link. It is in the state of Washington, USA, where I was having the most issues with water detection, so I hope this will serve as a good test. Lots of rapids in the water, so a mixture of colors, and lots of greenery on the banks of the river. The waterpolys and matching images were basically first run through TFE, and then I went back and hand-annotated the water to match the imagery. Not super precise but maybe good enough for the first test. I have better hand-annotated water, matching the imagery almost down to the pixel, which we can try next.

Included in download:
An overall screenshot of the area
Small sample images that I used in TFE (300+)
Shapefiles of the water
Geotiffs of the same shapefiles (WP in the filename stands for WaterPoly)
Large geotiff images at 60cm, 4-band

arno · 20 Sep 2023

Thanks, I'll have a look.

arno · 26 Sep 2023

Hi,

I finally had the time to look at the data. I see the water masks are not very accurate: they do not include all the river water, and there are also quite a few false positives on land where the mask indicates water. So I am not sure how well I can train an algorithm based on this data, but I can give it a try.
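One way to put numbers on how far a water mask is from the truth is to compare it pixel by pixel against a hand-annotated reference. The sketch below is only an illustration and not part of scenProc; the function name `mask_agreement` is made up, and it assumes both masks are already aligned rasters of the same size where nonzero means water.

```python
import numpy as np

def mask_agreement(candidate, reference):
    """Compare a candidate water mask against a trusted reference mask.

    Both inputs are 2-D arrays of the same shape where nonzero means water.
    Returns precision, recall and IoU as floats (NaN if undefined).
    """
    cand = np.asarray(candidate, dtype=bool)
    ref = np.asarray(reference, dtype=bool)
    if cand.shape != ref.shape:
        raise ValueError("masks must have the same shape")

    tp = np.count_nonzero(cand & ref)    # water in both masks
    fp = np.count_nonzero(cand & ~ref)   # mask says water, reference says land
    fn = np.count_nonzero(~cand & ref)   # water the candidate mask missed

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }
```

Low precision corresponds to false positives on land, and low recall corresponds to river water missing from the mask.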

Clutch Cargo · 26 Sep 2023

I can supply more accurate data. Let me look. I have some older 1m data but the water was all hand-annotated almost down to the pixel. I will also look for some better 60 cm data with better water. So if you feel it's not even worth trying with that data, don't waste your time.

On a different note, I d/l your latest version. Not sure if you did any tweaking on the water side but I seemed to notice some differences... some good, some not. I did not experience any UE's this time around, but I have had scenProc simply close down while I was saving some points. Not sure if that is better or worse than a UE, but of course there is no data to send you when this happens. I took an older TF2, deleted the sample images, deleted the points but kept the steps in place. Added new sample images and began to add points. When it was saving, that's when it closed down. Does leaving the steps contribute to the issue? Should I start fresh with no steps?

So I tested an area where there was no UE and ran the area. I then dropped the poly result into QGIS to have a look. At a far away view I grabbed a rectangle of area and saw lots and lots of tiny "x's" which made me think "oh, those must be water created from tree shadows again". But I took a closer look and found out they were actually water in swimming pools! The area I was testing has 100's of them, but I did not see any on shadows like I did before. This made me wonder: did you do a little work on this, or was I maybe just better at creating my TFE file?

Clutch Cargo · 27 Sep 2023

I did some digging and found lots of data for Southern California with good matching water tiffs. However, the images are 3-band, not 4. Will these still work, or must you have 4-band? My only other option is to create some new 4-band areas and hand-annotate the water.

If I did that how many samples would you need? I guess more is better so we are talking 100's/1,000's!? Maybe the sample source could be brought down by determining the type of water data we need... the types of water that have the most problems with detection I would think like dark shadows over water, white rapids, different colors of water? Maybe the samples do not have to be so large but pin-pointing where the certain waters are? Thoughts?

arno · 28 Sep 2023

Hi,

Clutch Cargo said:
I can supply more accurate data. Let me look. I have some older 1m data but the water was all hand-annotated almost down to the pixel. I will also look for some better 60 cm data with better water. So if you feel it's not even worth trying with that data, don't waste your time.

Let's first give this data a quick try, before you spend more time on looking for data.

Clutch Cargo said:
On a different note, I d/l your latest version. Not sure if you did any tweaking on the water side but I seemed to notice some differences... some good, some not. I did not experience any UE's this time around, but I have had scenProc simply close down while I was saving some points. Not sure if that is better or worse than a UE, but of course there is no data to send you when this happens. I took an older TF2, deleted the sample images, deleted the points but kept the steps in place. Added new sample images and began to add points. When it was saving, that's when it closed down. Does leaving the steps contribute to the issue? Should I start fresh with no steps?

I did not make many changes to the texture filter editor, and so far I have only fixed one of the bugs you reported before. The others are still on the todo list.

I don't think you need to start with a fresh texture filter, that should not matter in general.

Clutch Cargo said:
So I tested an area where there was no UE and ran the area. I then dropped the poly result into QGIS to have a look. At a far away view I grabbed a rectangle of area and saw lots and lots of tiny "x's" which made me think "oh, those must be water created from tree shadows again". But I took a closer look and found out they were actually water in swimming pools! The area I was testing has 100's of them, but I did not see any on shadows like I did before. This made me wonder: did you do a little work on this, or was I maybe just better at creating my TFE file?

Must be that you did a better selection of sample points if you used the SVM filter again this time.

Clutch Cargo said:
I did some digging and found lots of data for Southern California with good matching water tiffs. However, the images are 3-band, not 4. Will these still work, or must you have 4-band? My only other option is to create some new 4-band areas and hand-annotate the water.

I think 3-band imagery will not give such accurate results; the 4th band makes it a lot easier to recognise water. So I don't think it's a good idea to try with 3-band data.
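A common illustration of why a near-infrared band helps is the Normalized Difference Water Index, NDWI = (Green - NIR) / (Green + NIR). Water absorbs near-infrared strongly, so it tends to score positive while vegetation scores negative. The sketch below is an illustration only and not how scenProc or the TFE detects water; the band order (red, green, blue, NIR) and the zero threshold are assumptions.

```python
import numpy as np

def ndwi(image):
    """NDWI for an array shaped (rows, cols, 4), assumed band order R, G, B, NIR."""
    img = np.asarray(image, dtype=np.float64)
    green = img[..., 1]
    nir = img[..., 3]
    total = green + nir
    out = np.zeros_like(total)
    # Pixels where green + nir == 0 stay at 0 instead of dividing by zero.
    np.divide(green - nir, total, out=out, where=total != 0)
    return out

def water_mask(image, threshold=0.0):
    """Boolean mask of pixels whose NDWI exceeds the (assumed) threshold."""
    return ndwi(image) > threshold
```

With 3-band imagery the `nir` channel simply does not exist, so an index like this cannot be computed at all. A single threshold is also likely to struggle with cases such as white rapids or dark shadows, which is part of why a trained algorithm is attractive.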

Clutch Cargo said:
If I did that how many samples would you need? I guess more is better so we are talking 100's/1,000's!? Maybe the sample source could be brought down by determining the type of water data we need... the types of water that have the most problems with detection I would think like dark shadows over water, white rapids, different colors of water? Maybe the samples do not have to be so large but pin-pointing where the certain waters are? Thoughts?

Much more than that. The tests I did before used tens of thousands of sample images. Like I mentioned before, each sample is relatively small (e.g. 512x512 pixels), but there need to be many of them, including different water types but also areas without water. You really need many samples for each condition to train the algorithm well.
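Cutting a large image and its water mask into fixed-size training samples can be sketched as follows. This is a hypothetical helper, not the actual preparation code used for scenProc; the name `tile_pairs`, the choice to skip partial edge tiles, and the reported water fraction are assumptions for illustration.

```python
import numpy as np

def tile_pairs(image, mask, size=512):
    """Yield (row, col, image_tile, mask_tile, water_fraction) for full tiles.

    image: array shaped (rows, cols, bands); mask: array shaped (rows, cols),
    nonzero meaning water. Partial tiles at the right/bottom edge are skipped.
    """
    image = np.asarray(image)
    mask = np.asarray(mask)
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must cover the same pixels")

    rows, cols = mask.shape
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            m = mask[r:r + size, c:c + size]
            frac = np.count_nonzero(m) / m.size
            yield r, c, image[r:r + size, c:c + size], m, frac
```

The water fraction per tile makes it easy to check that the set contains tiles with no water at all as well as tiles with water.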

Clutch Cargo · 30 Sep 2023

Thx for all the replies. Ok, what do you think of this idea... if I can get it to work. My waterpolys in Southern California are my most accurate as they were hand created. I am thinking of downloading new 60cm 4-band imagery for the same area that only had 3-band. I would have to check the results, but I am hoping the water has not changed that much over time (could be just wishful thinking).

If and when this new technique gets implemented, would the user still have the option to add points? I would think that would be a good idea to help refine certain areas if needed. Another idea off the top of my head is a way for scenProc to add to its water detection database as users use their own imagery and TFE. I can see pros and cons on this, as bad data in would result in bad data out, but it could also contribute 1000's of samples from all over the world as more simmers use it. Maybe two databases... the first that scenProc comes with (most accurate, from your data), and then a 2nd database that expands with users' input? Just thinking out loud here.

Also, is it just as important to supply samples of "non-water" images?

I think I will go back and hand-annotate the first sample I sent you to at least use as a test.

Clutch Cargo · 11 Oct 2023

Arno, I have made available a new watermask for you to try. It is taken from the exact image I sent you before, WA_Section_47_R39. This time it was all hand-annotated. If this test works I can send you the other image R40 in a WM file. I am still working on sending a mass quantity of samples if my idea works.

I take it non-water areas are just as important to supply for you? Or does the algorithm only react to what is water and assume all else is non-water?

The link for this d/l is: https://drive.google.com/file/d/1ryU81V-cJ6YR904WbRJ5-CiAl_ciSvvW/view?usp=sharing

arno · 12 Oct 2023

Hi,

I have been so busy recently that I haven't made progress on this yet.

Yes, non water areas are just as important, as the algorithm has to learn what is water and what is not.

Clutch Cargo · 12 Oct 2023

Hey, no problem. As you said, the previous image had so many false positives. Hopefully, this will have better results. Let me know the results and whether creating the 2nd image would be helpful. In the meantime, I am working, as mentioned, on getting many more samples. Thx.

arno · 13 Oct 2023

I'll check the new image you provided.

Clutch Cargo · 26 Oct 2023

I performed some tests on downloading new 4-band imagery and attempting to match it up with old watermasks from around nine years ago. Too bad. The water levels were different enough that it would probably require a lot of touch-up work to match the new imagery. Probably better to start from scratch with the latest imagery. I am willing to put in dozens (if not literally hundreds) of hours over time to hand create true matching watermasks to provide as water detection samples, if you are game? Granted it would take time, but I could concentrate on the areas that we need most, and if TFE would still allow samples to be added as well as using a database, we could build up that database as time went on, making it more and more accurate.

It really depends on the latest sample I sent you. If that works then I would supply more like that. Thoughts?

arno · 27 Oct 2023

Hi,

Let me first see what I can do with the current images, before you spend a lot of time on additional images. I have been busy with work, family and MCX recently, so I haven't had the time yet to look at it.

The way the deep learning algorithm works means that as a user you can't just add some extra sample images and directly use them. The entire training of the algorithm with all the training data would then have to be run again.

Clutch Cargo · 27 Oct 2023

Yea, no problem, I know you have been busy. Interesting, so it sounds like the AI needs to be locked down with a given number of samples in order to work without having to "re-learn" with new samples. I guess I was hoping that as I and perhaps others use TFE it would continue to learn and get better.

One idea... scenProc provides a monthly, quarterly or whatever update which would include an updated water database upon the collection of more samples. 2nd idea... so I typically create a new TFE tf2 file about every 30 square miles (78 square kilometers). I have never thought of continuing to build upon each area, adding 100's and 100's of data points. I think that is because I have always feared TFE would crash (just based on my experience), plus I would think the processing time of adding 1000's of points and image samples would bring TFE to its knees and take up to hours to update and save a tf2? Am I correct on that assumption?

Yet another idea would be to submit to you finished tf2 files, samples and shapes. Rather than creating samples from scratch, these would be sort of a refined tf2, having gone through TFE once and then manually cleaned up where it was wrong.

Of course I will wait until you look at what I have supplied before going further. Thx.

arno · 27 Oct 2023

Clutch Cargo said:
Yea, no problem, I know you have been busy. Interesting, so it sounds like the AI needs to be locked down with a given number of samples in order to work without having to "re-learn" with new samples. I guess I was hoping that as I and perhaps others use TFE it would continue to learn and get better.

The problem is that the learning takes very long and requires special software. That's why it makes no sense to let the user run it again. On a CUDA-enabled GPU it can take a day or two, and on non-CUDA GPUs much longer. So I don't think a user would like to run that just to use the TFE.

Clutch Cargo said:
One idea... scenProc provides a monthly, quarterly or whatever update which would include an updated water database upon the collection of more samples. 2nd idea... so I typically create a new TFE tf2 file about every 30 square miles (78 square kilometers). I have never thought of continuing to build upon each area, adding 100's and 100's of data points. I think that is because I have always feared TFE would crash (just based on my experience), plus I would think the processing time of adding 1000's of points and image samples would bring TFE to its knees and take up to hours to update and save a tf2? Am I correct on that assumption?

Sure, if the algorithm has been improved it would be possible to share the trained model again in a scenProc update.

The geographical area on which you run the TFE does not have a direct relation with the number of samples and sample points. You would only need to add more samples if you have different features on there that need to be included in the selection logic. So you should not have hundreds of sample images in most cases.

Clutch Cargo said:
Yet another idea would be to submit to you finished tf2 files, samples and shapes. Rather than creating samples from scratch, these would be sort of a refined tf2, having gone through TFE once and then manually cleaned up where it was wrong.

Not sure how that would help. The results of training the water algorithm are not stored in the TF2 file. Only the results of the SVM step are stored there.
