Mantis Bug Tracker

View Issue Details
ID: 0008710
Project: opensim
Category: [GRID] Asset Service
View Status: public
Date Submitted: 2020-06-02 06:11
Last Update: 2020-06-02 11:31
Reporter: Orion Pseudo
Assigned To:
Priority: normal
Severity: crash
Reproducibility: random
Status: new
Resolution: open
Platform: Linux / Mono 6.8.0.123
OS: Fedora
OS Version: 31
Product Version: 0.9.1.0
Target Version:
Fixed in Version:
Summary: 0008710: Unhandled System.IO.IOException while writing FS Assets (Win32 IO Returned 997)
Description: Please note, I'm using OpenSimulator 0.9.1.1, not 0.9.1.0. 0.9.1.1 is not an option in the Product Version field.

Occasionally while ROBUST is writing assets, usually during an HG transfer, I encounter the above-mentioned crash (please see the screenshot). The crash does not appear to be written to the log; ROBUST simply terminates. Looking up the error code, 997 is ERROR_IO_PENDING, "Overlapped I/O operation is in progress" (https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--500-999-).
Steps To Reproduce:
1 - Configure ROBUST to use FSAssets in HyperGrid mode.
2 - Teleport to a HyperGrid location and find an object for sale that contains a large number of items.
3 - Buy or copy said item and monitor your ROBUST console for a crash.
Additional Information: Relevant configuration snippets:

[AssetService]

    ;; Choose an asset service (Only one option should be enabled)
    ;LocalServiceModule = "OpenSim.Services.AssetService.dll:AssetService"
    LocalServiceModule = "OpenSim.Services.FSAssetService.dll:FSAssetConnector"

    ;; FSAsset Directories. Base directory, where final asset files are stored and Spool directory for temp files
    ;; These directories must be on the same physical filesystem
    BaseDirectory = "/assets/data"
    SpoolDirectory = "/asset_spool/tmp"

    ;; Original service can be checked if FSAssets can not find an asset
    ;FallbackService = "OpenSim.Services.AssetService.dll:AssetService";

    ;; How many days since last updating the access time before its updated again by FSAssets when accessing an asset
    ;; Reduces DB calls if asset is requested often. Default value 0 will always update access time
    ;DaysBetweenAccessTimeUpdates = 30

    ;; Should FSAssets print read/write stats to the robust console, default is true
    ;ShowConsoleStats = true

    ;; FSAssets Custom Database Config (Leave blank to use grids default database configuration)
    ;StorageProvider = ""
    ;ConnectionString = ""
    ;Realm = "fsassets"

    ;; The following are common to both the default asset service and FSAsset service

    ;; Common asset service options
    DefaultAssetLoader = "OpenSim.Framework.AssetLoader.Filesystem.dll"
    AssetLoaderArgs = "./assets/AssetSets.xml"

    ; Allow maptile assets to be remotely deleted by remote calls to the asset service.
    ; There is no harm in having this as false - it just means that historical maptile assets are not deleted.
    ; This only applies to maptiles served via the version 1 viewer mechanisms
    ; Default is false
    AllowRemoteDelete = false

    ; Allow all assets to be remotely deleted.
    ; Only set this to true if you are operating a grid where you control all calls to the asset service
    ; (where a necessary condition is that you control all simulators) and you need this for admin purposes.
    ; If set to true, AllowRemoteDelete = true is required as well.
    ; Default is false.
    AllowRemoteDeleteAllTypes = false

[HGAssetService]
    ;; Use the second option if you have FSAsset service enabled
    ;LocalServiceModule = "OpenSim.Services.HypergridService.dll:HGAssetService"
    LocalServiceModule = "OpenSim.Services.HypergridService.dll:HGFSAssetService"

    UserAccountsService = "OpenSim.Services.UserAccountService.dll:UserAccountService"

    ; HGAssetService is a public-facing service that allows users to
    ; read and create assets when on another grid. This reuses the general asset service connector.
    ; Hence, if the user has set up authentication in [Network] to protect their private services
    ; make sure it is overridden for this public service.
    AuthType = None

    ;; Can overwrite the default in [Hypergrid], but probably shouldn't
    ; HomeURI = "${Const|BaseURL}:${Const|PublicPort}"

    ;; The asset types that this grid can export to / import from other grids.
    ;; Comma separated.
    ;; Valid values are all the asset types in OpenMetaverse.AssetType, namely:
    ;; Unknown, Texture, Sound, CallingCard, Landmark, Clothing, Object, Notecard, LSLText,
    ;; LSLBytecode, TextureTGA, Bodypart, SoundWAV, ImageTGA, ImageJPEG, Animation, Gesture, Mesh
    ;;
    ;; Leave blank or commented if you don't want to apply any restrictions.
    ;; A more strict, but still reasonable, policy may be to disallow the exchange
    ;; of scripts, like so:
    ; DisallowExport ="LSLText"
    ; DisallowImport ="LSLBytecode"
Tags: No tags attached.
Git Revision or version number:
Run Mode: Grid (Multiple Regions per Sim)
Physics Engine: BulletSim
Script Engine: XEngine
Environment: Mono / Linux64
Mono Version: 6.x
Viewer:
Attached Files: 0d8998be93c28e4697da49ea9675da00.png (85,518 bytes) 2020-06-02 06:11

Notes
(0036521)
tampa (reporter)
2020-06-02 06:28

A quick search suggests this could be a bug in Mono itself, though nothing I found indicates its status or whether an application can work around it. Another possibility is that the asset is too large to transfer into the asset server and so gets truncated and corrupted. Have you set higher limits in your database to allow larger packets to go through? What filesystem is your asset server running on, and what distro do you use? Ultimately it could also indicate an issue with the drive itself.
(0036522)
Orion Pseudo (reporter)
2020-06-02 07:16

@Tampa - The exception is coming from System.IO.FileStream, and from what I can tell it might be complaining that another IO operation is in progress.

I'm not set up to debug the code, but looking through FSAssetService.cs I suspect this block (found under the Writer method) might be where the break is occurring:

                        if (pathOk)
                        {
                            try
                            {
                                byte[] data = File.ReadAllBytes(files[i]);
 
                                using (GZipStream gz = new GZipStream(new FileStream(diskFile + ".gz", FileMode.Create), CompressionMode.Compress))
                                {
                                    gz.Write(data, 0, data.Length);
                                }
                                File.Delete(files[i]);
 
                                //File.Move(files[i], diskFile);
                            }
                            catch(System.IO.IOException e)
                            {
                                if (e.Message.StartsWith("Win32 IO returned ERROR_ALREADY_EXISTS"))
                                    File.Delete(files[i]);
                                else
                                    throw;
                            }
                        }

It looks like the whole thing is structured to run in an infinite loop that checks on each iteration whether any files are waiting in the spool folder. Just a hunch, but I wonder if it's maybe trying to process a file that's still being transferred from the foreign grid?
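One way to test the hunch above would be to skip spool files that may still be in flight. The following is only a minimal sketch: `SpoolCheck`, `LooksComplete` and the `minAge` parameter are hypothetical names and do not appear in FSAssetService.cs, and nothing in this thread establishes that the real spool writer can leave partial files visible. The check treats a file as ready only if it has not been modified recently and can be opened for reading.

```csharp
using System;
using System.IO;

public static class SpoolCheck
{
    // Hypothetical helper, not part of FSAssetService.cs.
    // Returns true only if the file is at least minAge old (by last write
    // time) and can currently be opened for reading.
    public static bool LooksComplete(string path, TimeSpan minAge)
    {
        try
        {
            DateTime lastWrite = File.GetLastWriteTimeUtc(path);
            if (DateTime.UtcNow - lastWrite < minAge)
                return false; // modified too recently, try again on a later pass

            // FileShare.None requests exclusive access. On Linux this may
            // not be enforced against other processes, so the age check
            // above does most of the work there.
            using (FileStream fs = new FileStream(path, FileMode.Open,
                                                  FileAccess.Read, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException)
        {
            // Missing file, sharing violation, or other I/O failure:
            // report "not ready" instead of throwing.
            return false;
        }
    }
}
```

A writer loop could call `SpoolCheck.LooksComplete(files[i], TimeSpan.FromSeconds(5))` before reading a file and simply move on to the next one when it returns false. The five-second threshold is an arbitrary illustration, not a recommended value.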
(0036523)
UbitUmarov (administrator)
2020-06-02 07:29
edited on: 2020-06-04 03:09

That does seem to be the block. It only handles the ERROR_ALREADY_EXISTS case, which is expected to happen.
The 997 makes no sense. Those operations are supposed to be blocking, i.e. the code waits for them to complete. So "overlapped I/O in progress" makes no sense.
So it also seems to be a Mono bug.
It may also be a damaged file or file system. Try running fsck.
Also inspect by hand that deep folder path /asset/data/30/...
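Because the quoted catch block rethrows every IOException other than ERROR_ALREADY_EXISTS, an unexpected error such as 997 ends up unhandled. As a rough illustration only, and not the project's actual fix, the compress step could report failure to its caller and leave the spool file in place for a later pass. `SpoolCompressor`, `TryCompress` and the `log` callback are hypothetical names introduced for this sketch.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class SpoolCompressor
{
    // Hypothetical helper, not part of FSAssetService.cs.
    // Compresses spoolFile into diskFile + ".gz" and deletes the spool file.
    // Returns true when the spool file was consumed, false when an I/O error
    // occurred and the file was left in place for a later pass.
    public static bool TryCompress(string spoolFile, string diskFile, Action<string> log)
    {
        try
        {
            byte[] data = File.ReadAllBytes(spoolFile);

            // FileMode.Create overwrites any partial output from an earlier
            // failed attempt.
            using (FileStream fs = new FileStream(diskFile + ".gz", FileMode.Create))
            using (GZipStream gz = new GZipStream(fs, CompressionMode.Compress))
            {
                gz.Write(data, 0, data.Length);
            }

            File.Delete(spoolFile);
            return true;
        }
        catch (IOException e)
        {
            // Log and keep going instead of letting the exception
            // terminate the process.
            log("Failed to store " + spoolFile + ": " + e.Message);
            return false;
        }
    }
}
```

This trades a crash for a possible endless retry of a permanently unreadable file, so a real implementation would probably want a retry limit or a quarantine folder. It also would not cure an underlying Mono or filesystem problem; it would only keep the service running while logging the error.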

(0036526)
Orion Pseudo (reporter)
2020-06-02 08:08

@Ubit - At present the assets folder is located on a 500 GB M.2 drive served from a FreeNAS box. It's mapped to the ROBUST server via CIFS. It looks like the file exists and there is content in there, but beyond that I can't say.

Let me try moving it over to a local ext4 partition to see if that makes a difference and I'll report back.
(0036529)
Orion Pseudo (reporter)
2020-06-02 09:52

@Ubit - It looks like moving the folder over to a local partition resolved the issue. It was able to copy a few thousand items over the course of about an hour without throwing the exception. I can only guess this is some sort of glitch between CIFS / SMB and .NET.
(0036530)
UbitUmarov (administrator)
2020-06-02 10:08

OK, thanks for your report. There is nothing we can do on that :(
(0036531)
Orion Pseudo (reporter)
2020-06-02 11:31

Understood and totally thanks for your help! :) Although I have to admit, sifting through this has kind of rekindled the code bug in me. I can't promise anything, but I might take a crack at cleaning this connector up a bit and possibly streamlining a few things.

Issue History
Date Modified      Username       Field
2020-06-02 06:11   Orion Pseudo   New Issue
2020-06-02 06:11   Orion Pseudo   File Added: 0d8998be93c28e4697da49ea9675da00.png
2020-06-02 06:28   tampa          Note Added: 0036521
2020-06-02 07:16   Orion Pseudo   Note Added: 0036522
2020-06-02 07:29   UbitUmarov     Note Added: 0036523
2020-06-02 08:08   Orion Pseudo   Note Added: 0036526
2020-06-02 09:52   Orion Pseudo   Note Added: 0036529
2020-06-02 10:08   UbitUmarov     Note Added: 0036530
2020-06-02 11:31   Orion Pseudo   Note Added: 0036531
2020-06-04 03:09   aiaustin       Note Edited: 0036523

