Processing Elements

This article describes the role of the Access Layer inside the GRID with regard to processing content. The intended audience is GRID developers, administrators, and anyone else who needs to understand the internals of GRID processing.

Concept Overview

Overview Diagram

Standard Flow

In the standard flow, the Access Layer initiates processing by sending an initial "Process Message" to the message bus queue that the first processing module listens to. Once processing and app intelligence have produced a result, the module that holds the final result sends it back to the Access Layer using the SOAP service method storeProcessingResultsAndFinalizeJobs(...).

Notes:

  • The Access Layer collects and stores all input from external sources such as Harvesters before processing starts. The information is managed in a transaction-safe way and is written through to the CoreDB once the "Initial Process Message" has been sent.

    This matters for timing: problems would be more likely if the source information were only updated after processing had completely finished.

  • By calling the service method startJob(...), the Access Layer sends the "Initial Process Message" to the message bus. One of the processing modules needs to pick it up and start processing it.
  • The processing modules are responsible for reporting the progress or failure of a Job using the SOAP service method updateJob(...), which sends information on the current progress or changes the job state if applicable.
  • As mentioned above, the last module in the chain should finalize a Job by sending the processing results back to the Access Layer. If processing fails and no result can be produced, the module that failed should update the corresponding Job to avoid orphaned entries.
  • Jobs are managed inside the CoreDB, and long-running Jobs are supported. Multiple Access Layer instances share the Job details with each other and are aware of all Jobs running inside one GRID site.
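
The job life cycle described in the notes above can be sketched as a small state machine. This is a minimal in-memory illustration, not the Access Layer implementation: the JobState values, the JobTracker class and the simplified signatures are assumptions made for this example. Only the service method names mentioned in the comments (prepareJob, startJob, updateJob, storeProcessingResultsAndFinalizeJobs) come from this article.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical job states; the real Access Layer may use different ones. */
enum JobState { PREPARED, RUNNING, FAILED, FINISHED }

/** In-memory stand-in for the job handling of the Access Layer (illustration only). */
final class JobTracker {
    private final Map<UUID, JobState> jobs = new ConcurrentHashMap<>();

    /** Registers a job before any message is sent, similar in spirit to prepareJob(). */
    UUID prepareJob() {
        UUID id = UUID.randomUUID();
        jobs.put(id, JobState.PREPARED);
        return id;
    }

    /** Stands in for startJob(...); the real call also sends the "Initial Process Message". */
    void startJob(UUID id) {
        transition(id, JobState.PREPARED, JobState.RUNNING);
    }

    /** Stands in for updateJob(...): a module reports a failure so the Job is not orphaned. */
    void failJob(UUID id) {
        transition(id, JobState.RUNNING, JobState.FAILED);
    }

    /** Stands in for storeProcessingResultsAndFinalizeJobs(...): the last module delivers results. */
    void finalizeJob(UUID id) {
        transition(id, JobState.RUNNING, JobState.FINISHED);
    }

    JobState stateOf(UUID id) {
        return jobs.get(id);
    }

    private void transition(UUID id, JobState expected, JobState next) {
        // replace(key, old, new) is atomic and returns false for unknown jobs or other states
        if (!jobs.replace(id, expected, next)) {
            throw new IllegalStateException("Job " + id + " is not in state " + expected);
        }
    }
}
```

In this sketch a failed Job cannot be finalized afterwards, which mirrors the rule that either the last module finalizes the Job or the failing module updates it.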

Non-Standard Flows

Using this concept, non-standard flows can be supported as well. The diagram above shows two possible entry points for inserting pre-processed data:

  • Importing Legacy Data: Existing data may be imported into the standard GRID workflow by assembling "Process Messages" and sending them to the relevant queue inside the message bus. In the example above this is the queue that the "App Intelligence" processes listen to, but the data can be inserted at any other step as well.

    Note: As of 1 July 2010, many legacy processes are still in use. The agent that translates legacy data into "Process Messages" may also be used inside the normal processing workflow.

  • Multisite Data Feed: "Process Messages" that went through a full processing cycle need not be sent to only one GRID site; they can also be multiplexed to multiple sites. This is possible because the Access Layer resolves conflicts automatically and de-duplicates the data that is sent to its interfaces. See Multisite Processing for more information on this topic.

Process Message Format

The message format used for processing is based on schema-validated XML that describes the full processing life cycle. This means a single format is used both to initiate the process and to collect the final feedback. Each processing step should only add the information that is still missing or remove intermediate information (if applicable).

In a purely message-driven environment, no database is required to buffer intermediate information when passing package-related content from one processing step to the next. However, a process-related database may still be needed to exchange non-package-related content.

Initiate Processing

The following message shows the content that the Access Layer sends in order to initiate processing. The Access Layer sends this message to the "incoming" queue of the processing message bus. Once it is sent, the processing system uses it as input for all further tasks.

Initial Process Message
<ns3:processPackageDataSet xmlns:ns3="http://grid.trendmicro.com/services/level-0"
 xmlns:ns2="http://grid.trendmicro.com/metadata/1.0"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 processPriority="4" processType="standard" processingSite="urn:trendmicro:grid:processing-site:acl-node:10.52.66.125"
 processingStarted="2014-02-10T01:05:56.315-08:00" referringJob="652fba6c-cd70-4883-b77a-00662ad08607"
 xsi:schemaLocation="http://grid.trendmicro.com/services/level-0 process-package-data-set.xsd">
    <processSource internalURI="..." remoteURI="...">
        <sourceInformation temporary="false">
            <identifier sha1="5976FE464F419C46E23A80A46318FA296B368AB2"/>
            <lastModified>2014-02-10T01:05:56.315-08:00</lastModified>
            <contentTag>--sdfst23sdf53</contentTag>
        </sourceInformation>
        <sourceDomain name="public.domain">
            <ns2:metadata>
                <ns2:meta booleanValues="false" name="domainIsPrimarySource"/>
            </ns2:metadata>
        </sourceDomain>
        <ns2:metadata>
            <ns2:meta name="publishedSize" numericValues="5220.0"/>
            <ns2:meta name="publishedContentType" value="application/octet-stream"/>
            <ns2:meta name="pageTitle" value="Microsoft Office Downloads"/>
            <ns2:meta name="metaDescription" value=""/>
            <ns2:meta name="metaKeywords" value=""/>
            <ns2:meta name="contentTitle" value="Microsoft Office 2007 - x86"/>
            <ns2:meta name="contentDescription" value="..."/>
            <ns2:meta name="contentLocale" value="en_US"/>
            <ns2:meta binaryValue="Z8T6z6iebCazMPw7Lkvc/GqjOG8=" name="sourceContentHash"/>
            <ns2:meta name="sourceContentHashAlgorithm" value="SHA1"/>
        </ns2:metadata>
    </processSource>
</ns3:processPackageDataSet>
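
A processing module that picks up such a message typically needs the job reference and priority from the root element first. The following is a minimal sketch using the standard Java DOM parser. The attribute names are taken from the message above; the class ProcessMessageHeader and its fields are hypothetical and not part of the GRID code base.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

/** Hypothetical value object holding a few root attributes of a process message. */
final class ProcessMessageHeader {
    final String referringJob;
    final String processType;
    final int processPriority;

    private ProcessMessageHeader(String referringJob, String processType, int processPriority) {
        this.referringJob = referringJob;
        this.processType = processType;
        this.processPriority = processPriority;
    }

    /** Parses the root element; any parser or number format error becomes an unchecked exception. */
    static ProcessMessageHeader parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Limits entity expansion and similar constructs in untrusted input
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return new ProcessMessageHeader(
                    root.getAttribute("referringJob"),
                    root.getAttribute("processType"),
                    Integer.parseInt(root.getAttribute("processPriority")));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable process message", e);
        }
    }
}
```

This sketch does not validate against process-package-data-set.xsd; a real module would presumably do so, since the format is schema-validated.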

Remark: If the source metadata grows beyond roughly 4 KB, it can be useful to link larger source content by attaching it to the main source using unique source identifiers.

At the time of writing, the GRID backend does not implement this feature. The ACL accepts such messages, but they have no effect on how a request is processed.

The following example illustrates the (currently unsupported) usage:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<ns3:processPackageDataSet xmlns:ns3="http://grid.trendmicro.com/services/level-0"
                           xmlns:ns2="http://grid.trendmicro.com/metadata/1.0"
                           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                           xsi:schemaLocation="http://grid.trendmicro.com/services/level-0 process-package-data-set.xsd"
  
                           processingSite="urn:trendmicro:grid:processing-site:acl-node:10.34.114.84"
                           processType="standard" processPriority="4"
                           processingStarted="2010-05-22T14:56:44.690+02:00"
                           referringJob="e4b3a77c-c68f-48b9-ba6a-83aa5f4ee238">
  
    <processSource internalURI="cifs://0ba5e77e9888db033e30f50014b43547582aed9c"
                   remoteURI="http://public.domain/path/to/file.exe">
        ...
        <ns2:metadata>
            ...
            <ns2:meta name="contentTitle" value="Microsoft Office 2007 - x86"/>
            <ns2:meta name="attachedContentDescriptions">
                <ns2:v>909124D3C7C3CAA6505420096905D488D30D379F</ns2:v>
                <ns2:v>D3C7C3CAA65D488D90912430D379F05420096905</ns2:v>
            </ns2:meta>
            ...
        </ns2:metadata>
    </processSource>
    <processSource internalURI="cifs://33e30f5888db0352aed9c40014b40ba5e77e9758"
                   remoteURI="http://public.domain/path/to/site-content.html">
        <sourceInformation temporary="false">
            <identifier sha1="909124D3C7C3CAA6505420096905D488D30D379F"/>
            ...
        </sourceInformation>
        ...
    </processSource>
    <processSource internalURI="cifs://f5888db0352a335e77e9758e30ed9c40014b40ba"
                   remoteURI="http://public.domain/path/to/other-site-content.html">
        <sourceInformation temporary="false">
            <identifier sha1="D3C7C3CAA65D488D90912430D379F05420096905"/>
            ...
        </sourceInformation>
        ...
    </processSource>
</ns3:processPackageDataSet>

Finalize Processing

The following message shows the content that the Access Layer expects in order to finish processing of a single package. The Access Layer offers a SOAP service method called storeProcessingResultsAndFinalizeJobs(...) that accepts one or more such messages and updates the CoreDB with the processing results they contain.

Notes:

  • This message contains the content of a single package, such as a single installer archive or any other file container that the processing system can decode.
  • When a package contains another package, a separate message of this type is required for every child package. Packages are linked through "packageMember" elements in the same way as standard files.
  • A single "Initial Process Message" may produce many "Process Result Messages", depending on the contents of the initially processed package. The processing system is responsible for creating the multiple messages.
  • A "packageMember" may contain detailed or basic information on the referenced file. Adding detailed information on a referenced package wastes space, because the same information is also contained in the separate message describing that package.
  • Every "Process Result Message" requires a unique Job-ID. If a single "Initial Process Message" produces many "Process Result Messages", each should be assigned a "SubJob-ID". See the SOAP service method prepareSubJob(UUID parentJobId).
  • Named elements inside the result message follow the naming rules defined in "Naming Elements".
  • The element <packageFamily/> (including <vendor/>) is optional. If vendor or package family entries are missing from the CoreDB, the Access Layer implicitly creates them based on the names specified in the received "Process Result Message".

Process Result Message
<ns3:processPackageDataSet xmlns:ns3="http://grid.trendmicro.com/services/level-0"
 xmlns:ns2="http://grid.trendmicro.com/metadata/1.0"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 processPriority="4" processType="standard" processingSite="urn:trendmicro:grid:processing-site:acl-node:10.52.66.125"
 processingStarted="2014-02-10T01:05:56.722-08:00" referringJob="8a45321d-f2b9-4102-8d8f-742a4765f02c"
 xsi:schemaLocation="http://grid.trendmicro.com/services/level-0 process-package-data-set.xsd">
    <processSource internalURI="..." remoteURI="...">
        <sourceInformation temporary="false">
            <identifier sha1="B5495CF3E3DBB053ACCAB9721537CCBBFE22C1E9"/>
            <lastModified>2014-02-10T01:05:56.722-08:00</lastModified>
            <contentTag>--sdfst23sdf53</contentTag>
        </sourceInformation>
        <sourceDomain name="public.domain">
            <ns2:metadata>
                <ns2:meta booleanValues="false" name="domainIsPrimarySource"/>
            </ns2:metadata>
        </sourceDomain>
        <ns2:metadata>
            <ns2:meta name="publishedSize" numericValues="32868.0"/>
            <ns2:meta name="publishedContentType" value="application/octet-stream"/>
            <ns2:meta name="pageTitle" value="Microsoft Office Downloads"/>
            <ns2:meta name="metaDescription" value=""/>
            <ns2:meta name="metaKeywords" value=""/>
            <ns2:meta name="contentTitle" value="Microsoft Office 2007 - x86"/>
            <ns2:meta name="contentDescription" value="..."/>
            <ns2:meta name="contentLocale" value="en_US"/>
            <ns2:meta binaryValue="0+glFMlByIjmTZk1agVRnctAaik=" name="sourceContentHash"/>
            <ns2:meta name="sourceContentHashAlgorithm" value="SHA1"/>
        </ns2:metadata>
    </processSource>
    <processedPackage>
        <packageFamily basename="microsoft:office">
            <vendor firstSeen="2014-02-10T01:05:56.722-08:00" name="microsoft">
                <displayName>Microsoft</displayName>
                <ns2:metadata>
                    <ns2:meta name="vendorReputation" value="high"/>
                </ns2:metadata>
            </vendor>
            <displayName>Microsoft Office</displayName>
            <ns2:metadata>
                <ns2:meta name="familyProductType">
                    <ns2:v>office-suite</ns2:v>
                    <ns2:v>productivity</ns2:v>
                    <ns2:v>email</ns2:v>
                </ns2:meta>
            </ns2:metadata>
        </packageFamily>
        <packageInformation familyName="microsoft:office" name="microsoft:office:2007:windows:x86:en_US"
         tags="office productivity email" vendorName="microsoft">
            <displayName>Microsoft Office 2007 EN (x86)</displayName>
            <packageFileInformation firstSeen="2014-02-10T01:05:56.722-08:00"
             lastProcessed="2014-02-10T01:05:56.722-08:00" lastRetrieved="2014-02-10T01:05:56.722-08:00"
             tags="installer executable signed clean"/>
        </packageInformation>
        <ns2:metadata>
            <ns2:meta name="scanVendors">
                <ns2:v>SCANNER_ALPHA_GEN</ns2:v>
                <ns2:v>SCANNER_BETA_GEN</ns2:v>
                <ns2:v>SCANNER_DIGISIG</ns2:v>
                <ns2:v>SCANNER_KASPERSKY</ns2:v>
                <ns2:v>SCANNER_MCAFEE</ns2:v>
                <ns2:v>SCANNER_MICROSOFT</ns2:v>
                <ns2:v>SCANNER_SOPHOS</ns2:v>
                <ns2:v>SCANNER_SYMANTEC</ns2:v>
            </ns2:meta>
            <ns2:meta booleanValues="true true true true true true true true" name="scanPassed"/>
            <ns2:meta name="scanVendorsVersion">
                <ns2:v>[01.34]ENG[8.910-1002]LPTPTN[5.101.00][[PATTERN_OUTDATED](5014 hrs 37 mins since last
                 update)]</ns2:v>
                <ns2:v>[02.52]ENG[8.910-1002]LPTPTN[5.101.00][[PATTERN_OUTDATED](5014 hrs 37 mins since last
                 update)]</ns2:v>
                <ns2:v>[31.45][[PATTERN_OUTDATED](5014 hrs 37 mins since last update)]</ns2:v>
                <ns2:v>[08.27]ENG[4.110-2000][[PATTERN_OUTDATED](5014 hrs 37 mins since last update)]</ns2:v>
                <ns2:v>[10.52]ENG[6.930-4001][[PATTERN_OUTDATED](5014 hrs 37 mins since last update)]</ns2:v>
                <ns2:v>[121.36]ENG[1.2110-1000][[PATTERN_OUTDATED](5014 hrs 37 mins since last update)]</ns2:v>
                <ns2:v>[01.72]ENG[6.20][[PATTERN_OUTDATED](5014 hrs 37 mins since last update)]</ns2:v>
            </ns2:meta>
        </ns2:metadata>
        <fileMetadata>
            <identifier md5="5A14E759F54E9A4B33BFE1F4A09726A8" sha1="D3E82514C941C888E64D99356A05519DCB406A29"/>
            <ns2:metadata>
                <ns2:meta name="companyName" value="Microsoft Corp."/>
                <ns2:meta name="originalFileName" value="Setup.exe"/>
                <ns2:meta name="internalName" value="Microsoft Office Setup"/>
                <ns2:meta name="fileSize" numericValues="27748.0"/>
                <ns2:meta name="fileVersion" value="12.0.6514.5000"/>
                <ns2:meta name="productName" value="2007 Microsoft Office System"/>
                <ns2:meta binaryValue="UFwhJKclp/c5cbUacKuzpQILw3RDQaPPdA9PRhmSk7A=" name="sha256"/>
                <ns2:meta
                 binaryValue="PrF9eybwmb30T36IqgTVm3sQ9u+UJwjGcq94JuC1mnRm/vrR1RvuQc0dOBGJcKoZGCnmcVyCOVhkf1VARqAXTA==" name="sha512"/>
                <ns2:meta booleanValues="true" name="scanPassed"/>
            </ns2:metadata>
        </fileMetadata>
    </processedPackage>
    <packageMember>
        <identifier fileName="word.exe" md5="74A64682EC187EAD01A4F0626734D9B9"
         sha1="8B6AC8CC2A528F96A86FF916912207B14A5F4362"/>
        <detailedInformation>
            <identifier md5="74A64682EC187EAD01A4F0626734D9B9" sha1="8B6AC8CC2A528F96A86FF916912207B14A5F4362"/>
            <ns2:metadata>
                <ns2:meta name="companyName" value="Microsoft Corp."/>
                <ns2:meta name="originalFileName" value="WinWord.exe"/>
                <ns2:meta name="internalName" value="Word"/>
                <ns2:meta name="fileSize" numericValues="46180.0"/>
                <ns2:meta name="fileVersion" value="12.0.6514.5000"/>
                <ns2:meta name="productName" value="2007 Microsoft Office System"/>
                <ns2:meta binaryValue="fjW2aMwNQnyy4Jas251DBfuNetqOx40gJz2MdZ+T7YI=" name="sha256"/>
                <ns2:meta
                 binaryValue="34hg70jRtKyosMh6MyiVflufkTCq+Qne62LKmCJ6zRhry+S3lLR7bLUquZZCS4wvJ0PYDNpVpghIq6XZFsNvIw==" name="sha512"/>
                <ns2:meta booleanValues="true" name="scanPassed"/>
            </ns2:metadata>
            <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
             lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
        </detailedInformation>
    </packageMember>
    <packageMember>
        <identifier fileName="excel.exe" md5="E478B190C53DC3FF65835E12EFF30736"
         sha1="58151BBC47B48834195598B37373B3B94E7B3B2E"/>
        <detailedInformation>
            <identifier md5="E478B190C53DC3FF65835E12EFF30736" sha1="58151BBC47B48834195598B37373B3B94E7B3B2E"/>
            <ns2:metadata>
                <ns2:meta name="companyName" value="Microsoft Corp."/>
                <ns2:meta name="originalFileName" value="Excel.exe"/>
                <ns2:meta name="internalName" value="Excel"/>
                <ns2:meta name="fileSize" numericValues="36964.0"/>
                <ns2:meta name="fileVersion" value="12.0.6514.5000"/>
                <ns2:meta name="productName" value="2007 Microsoft Office System"/>
                <ns2:meta binaryValue="RgPlilIkb34TfH6BVOVUmWEuGu1gtWW3D3gGmkg/8SA=" name="sha256"/>
                <ns2:meta
                 binaryValue="Ysg9jBQ51MKPImdX+Hf6uQsY0QI7Ab2/qZS1jVWdu3DJuW3aDLXVMi67kp5k7DvGKYdEmY4VcsBCzcJSREa1Eg==" name="sha512"/>
                <ns2:meta booleanValues="true" name="scanPassed"/>
            </ns2:metadata>
            <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
             lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
        </detailedInformation>
    </packageMember>
    <packageMember>
        <identifier fileName="sysctl.exe" md5="D58F95277434EC2540ABE126E540DDAB"
         sha1="89015F7337242CE12E18ADAF6F718EFD12348059"/>
        <detailedInformation>
            <identifier md5="D58F95277434EC2540ABE126E540DDAB" sha1="89015F7337242CE12E18ADAF6F718EFD12348059"/>
            <ns2:metadata>
                <ns2:meta name="companyName" value="Microsoft Corp."/>
                <ns2:meta name="originalFileName" value="sysctl.dll"/>
                <ns2:meta name="internalName" value="sysctl.dll"/>
                <ns2:meta name="fileSize" numericValues="73828.0"/>
                <ns2:meta name="fileVersion" value="12.0.6514.5000"/>
                <ns2:meta name="productName" value="2007 Microsoft Office System"/>
                <ns2:meta binaryValue="I/gDmLRNHLiUDbUF8XTsUW260wN0jczW08JRmz7qZH0=" name="sha256"/>
                <ns2:meta
                 binaryValue="drywkxKUfiJuE3alpjkoe27LfR6Teto8fn7jrsJIRIThb9KVLehIXtcAxhkbW+c8TT6VD1FZfLHE6aReyVr0DQ==" name="sha512"/>
                <ns2:meta booleanValues="true" name="scanPassed"/>
            </ns2:metadata>
            <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
             lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
        </detailedInformation>
    </packageMember>
    <packageMember>
        <identifier fileName="windgl.exe" md5="E7059891B89BB9EA00AF42EF9474EB3B"
         sha1="F41C9252EA529969F6C63FD45D8700D7F392CE7C"/>
        <detailedInformation>
            <identifier md5="E7059891B89BB9EA00AF42EF9474EB3B" sha1="F41C9252EA529969F6C63FD45D8700D7F392CE7C"/>
            <ns2:metadata>
                <ns2:meta name="companyName" value="Microsoft Corp."/>
                <ns2:meta name="originalFileName" value="windgl.dll"/>
                <ns2:meta name="internalName" value="windgl.dll"/>
                <ns2:meta name="fileSize" numericValues="85092.0"/>
                <ns2:meta name="fileVersion" value="12.0.6514.5000"/>
                <ns2:meta name="productName" value="2007 Microsoft Office System"/>
                <ns2:meta binaryValue="Z9fjy9tV2onfgAR1k/636SG/nRm6b1cHdsRzR0H1Q2Q=" name="sha256"/>
                <ns2:meta
                 binaryValue="vD5U0euP0eGi1edFFSDUkfpU+Stss2csqD2ZIHuRfXP6jokTRAnw8GGjXOs0csUt8EapnTFdMZEJixSgnibYdA==" name="sha512"/>
                <ns2:meta booleanValues="true" name="scanPassed"/>
            </ns2:metadata>
            <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
             lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
        </detailedInformation>
    </packageMember>
    <packageMember>
        <identifier fileName="readme.txt" md5="468DE2BCCCB786FE635B4D6D83FCF092"
         sha1="189A4AFD8208761A56BFDE823704B736E749E8D6"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="dummy.xcr" md5="F9657D3DCB8685619D77441F935414E6"
         sha1="A63FF8414F90875E6DD925AA7590C1A38FA14E01"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="example.doc" md5="98F4FB80391C774D92F9FDEC67CB62A9"
         sha1="2E2D8F6CA5E5413B664D372311C69235EC5457AF"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="license.txt" md5="21458CC5CF9566AF857DAA3BF97F7DC1"
         sha1="B619C2EB3606C7E276158C5DD437C35BD139B7A2"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="dsa.txt" md5="D97D811CA541D8AACA685BE6F7BDCDAE"
         sha1="9AB535068C49C1D035DCD23B4299078C9161B4FF"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="start.bat" md5="F95F1A5A0AA76CE0E9AB7850B9753D20"
         sha1="C6FBA16567A0DCEDF977FD55B98F674A26BC26A8"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="templates/default-template.dot" md5="A0DC3FB8BEC85E46A011D3E5BECC8BDA"
         sha1="143AA58ECAE1896A3E0749EE8D6956124619CB12"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="templates/default-template.dotx" md5="F97CF6C3D92ADEC03750B296DA52C7A1"
         sha1="CC1D7BFEB9531DF2CA959EFD66A5B7F2F3D6C2D3"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="artwork/logo.png" md5="669E7F5326475C87143CD77BB74C30D7"
         sha1="365F5D40FDBF25B8C781D88372FB4204CF82AD32"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
    <packageMember>
        <identifier fileName="artwork/splash.bmp" md5="89C5FDE55BEDC20DAF22D63ABCFA9394"
         sha1="2BDF5B91EB0F551C0CC45EC6ADE895276DDB6630"/>
        <information firstSeen="2014-02-10T01:05:56.722-08:00" lastProcessed="2014-02-10T01:05:56.722-08:00"
         lastRetrieved="2014-02-10T01:05:56.722-08:00" tags="clean"/>
    </packageMember>
</ns3:processPackageDataSet>
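
When one "Initial Process Message" yields several result messages, each message needs its own Job-ID. The sketch below shows one possible way to hand out those IDs. It assumes that the outermost package keeps the parent Job-ID and that every further package gets a sub-job; the article does not state this, so treat it as an assumption. SubJobSource and ResultJobAssigner are hypothetical names; only prepareSubJob(UUID parentJobId) is taken from the notes above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical slice of the processing service; only the method name comes from the article. */
interface SubJobSource {
    UUID prepareSubJob(UUID parentJobId);
}

final class ResultJobAssigner {
    /**
     * Maps each package (keyed by its SHA-1) to the Job-ID its result message should carry.
     * Assumption: the first (outermost) package reuses the parent job, all others get a sub-job.
     */
    static Map<String, UUID> assign(UUID parentJobId, List<String> packageSha1s, SubJobSource service) {
        Map<String, UUID> jobIds = new LinkedHashMap<>();
        for (String sha1 : packageSha1s) {
            if (jobIds.containsKey(sha1)) {
                continue; // the same package needs only one result message
            }
            UUID jobId = jobIds.isEmpty() ? parentJobId : service.prepareSubJob(parentJobId);
            jobIds.put(sha1, jobId);
        }
        return jobIds;
    }
}
```

Keeping the mapping in insertion order makes it easy to emit the result messages in the order in which the packages were discovered.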

Examples on using the "Processing API"

The following examples show different usage scenarios and how they translate to calls against the processing API, which is available as a SOAP interface under http://host:port/ws/level-0/internal/processing?wsdl and http://host:port/ws/level-0/internal/sources?wsdl

Harvesting

Default Sequence Example

Default Harvesting Sequence

Note: To keep it easy to read, the sequence diagram does not cover conditional uploads, in contrast to the example code below. How conditional uploads are handled can be seen in that example code.

Setting Processing Priority

The system schedules jobs with a priority from 1 to 7, where 7 is the highest and 4 is the default. Processing priority can be set in three locations: the Metadata blocks inside the elements SourceDomain, Source and Job. If no priority is set, it defaults to 4; if it is set in multiple locations, the highest priority is assigned.

Note: Changing the job priority after the job has started is not permitted, because the Access Layer no longer has any control over prioritization at that point.
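
The resolution rule described above (default 4, highest value wins, valid range 1 to 7) can be written down in a few lines. This is a sketch under stated assumptions, not the Access Layer code: the class name JobPriorities is hypothetical, and rejecting out-of-range values with an exception is a choice made for this example.

```java
/** Hypothetical helper that applies the priority rule described in this section. */
final class JobPriorities {
    static final int MIN = 1;
    static final int MAX = 7;
    static final int DEFAULT = 4;

    /** Each argument is the priority set at one location, or null if it is not set there. */
    static int effective(Integer sourceDomain, Integer source, Integer job) {
        int highest = 0; // 0 means "nothing set", because valid priorities start at 1
        for (Integer priority : new Integer[] {sourceDomain, source, job}) {
            if (priority == null) {
                continue;
            }
            if (priority < MIN || priority > MAX) {
                throw new IllegalArgumentException("Priority out of range 1..7: " + priority);
            }
            highest = Math.max(highest, priority);
        }
        return highest == 0 ? DEFAULT : highest;
    }
}
```

Note that under this reading a single explicit value below 4 lowers the priority, because the default only applies when nothing is set at all.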

Example Code (Multiple Options)

// Option A: Setting the priority on the source
Source source = sourceService.getSource(identifier);
source.getMetadata().addOrGet("jobPriority").value(7);

// Option B: Setting the priority on the job, version 1
UUID jobId = processingService.prepareJob();
processingService.setJobPriority(jobId, 7);

// Option C: Setting the priority on the job, version 2 (reuses jobId from Option B)
Job job = processingService.getJob(jobId);
job.getMetadata().addOrGet("jobPriority").value(7);
processingService.updateJob(job);

Processing

Initiate Processing - Data Flow Example

Legacy Processing

Initiate Processing with Legacy Support

Outlook on Processing without Legacy Support

Initiate Processing without Legacy Support

Collect Processing Results

Convert Legacy Results and continue with AppInt

Collect Final Results

Collect Final Results and send them to the Access Layer.

Monitoring

Processing can be monitored via the processing-related interfaces, which allow running jobs to be queried. The accuracy of the results may vary, but the interface can at least distinguish three job states: running, finished, and running for a very long time (probably stale).
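
The three states named above can be derived from a job's start time and completion flag. The following is a minimal sketch: the names MonitoredState and JobMonitor and the configurable staleness threshold are assumptions for illustration, since the article does not say how long "a very long time" is.

```java
import java.time.Duration;
import java.time.Instant;

/** The three states named in this section; the enum itself is hypothetical. */
enum MonitoredState { RUNNING, FINISHED, PROBABLY_STALE }

final class JobMonitor {
    /**
     * Classifies a job for monitoring purposes.
     * staleAfter is an assumed, caller-supplied threshold; the article defines no value.
     */
    static MonitoredState classify(Instant started, boolean finished, Instant now, Duration staleAfter) {
        if (finished) {
            return MonitoredState.FINISHED;
        }
        Duration age = Duration.between(started, now);
        return age.compareTo(staleAfter) > 0 ? MonitoredState.PROBABLY_STALE : MonitoredState.RUNNING;
    }
}
```

Because long-running Jobs are supported, a job flagged as probably stale is only a hint for an operator and not proof that the job has failed.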