kevinmevin committed on
Commit 50aa75b · 1 Parent(s): 042fca8

Upload folder using huggingface_hub
README.md CHANGED
@@ -1,12 +1,6 @@
  ---
- title: Demo MSE-CNN
- emoji: 📉
- colorFrom: yellow
- colorTo: pink
  sdk: gradio
- sdk_version: 3.44.3
- app_file: app.py
- pinned: false
  ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

  ---
+ title: Demo_MSE-CNN
+ app_file: demo.py
  sdk: gradio
+ sdk_version: 3.34.0
  ---
demo.py ADDED
@@ -0,0 +1,440 @@
+ """@package docstring
+
+ @file demo.py
+
+ @brief Demonstration of the application of the MSE-CNN
+
+ Note: in order to run this script, you have to run it from inside the folder
+
+ @section libraries_demo Libraries
+ - msecnn
+ - train_model_utils
+ - cv2
+ - dataset_utils
+ - re
+ - sys
+ - numpy
+ - gradio
+ - torch
+ - custom_dataset
+ - PIL
+
+ @section classes_demo Classes
+ - None
+
+ @section functions_demo Functions
+ - setup_model()
+ - int2label(split)
+ - draw_partition(img, split, cu_pos, cu_size)
+ - split_fm(cu, cu_pos, split)
+ - partition_img(img, img_yuv)
+ - pipeline(img, text)
+ - main()
+
+ @section global_vars_demo Global Variables
+ - PATH_TO_COEFFS = "../../../model_coefficients/best_coefficients"
+ - LOAD_IMAGE_ERROR = "load_image_error.png"
+ - EXAMPLE_IMGS = ["example_img_1.jpeg", "example_img_2.jpeg"]
+ - CTU_SIZE = (128, 128)
+ - FIRST_CU_POS = torch.tensor([0, 0]).reshape(shape=(-1, 2))
+ - FIRST_CU_SIZE = torch.tensor([64, 64]).reshape(shape=(-1, 2))
+ - DEV = "cuda" if torch.cuda.is_available() else "cpu"
+ - QP = 32
+ - model = None
+ - COLOR = (0, 247, 255)
+ - LINE_THICKNESS = 1
+ - DEFAULT_TEXT_FOR_COORDS = "Insert CTU position in the image..."
+
+ @section todo_demo TODO
+ - Instead of obtaining only the best split, apply thresholding and keep splitting until the right type of split is found
+
+ @section license License
+ MIT License
+ Copyright (c) 2022 Raul Kevin do Espirito Santo Viana
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ @section author_demo Author(s)
+ - Created by Raul Kevin Viana
+ - Last modified: 2023-09-10 21:00:10.225508
+ """
+
+
+ # ==============================================================
+ # Imports
+ # ==============================================================
+
+ import gradio as gr
+ import cv2 as cv
+ import sys
+ import torch
+ from PIL import Image
+ import numpy as np
+ import re
+
+ sys.path.append("../")
+ import msecnn
+ import dataset_utils as du
+ import custom_dataset as cd
+ import train_model_utils as tmu
+
+
+ # ==============================================================
+ # Constants and Global Variables
+ # ==============================================================
+
+ PATH_TO_COEFFS = "../../../model_coefficients/best_coefficients"
+ LOAD_IMAGE_ERROR = "load_image_error.png"
+ EXAMPLE_IMGS = ["example_img_1.jpeg", "example_img_2.jpeg"]
+ CTU_SIZE = (128, 128)
+ FIRST_CU_POS = torch.tensor([0, 0]).reshape(shape=(-1, 2))
+ FIRST_CU_SIZE = torch.tensor([64, 64]).reshape(shape=(-1, 2))
+ DEV = "cuda" if torch.cuda.is_available() else "cpu"
+ QP = 32
+ model = None
+ COLOR = (0, 247, 255)
+ LINE_THICKNESS = 1
+ DEFAULT_TEXT_FOR_COORDS = "Insert CTU position in the image..."
+
+ # ==============================================================
+ # Functions
+ # ==============================================================
+
+ def setup_model():
+     """!
+     @brief Initializes the MSE-CNN stages and loads their parameters
+     """
+     # Initialize one model per stage (stages 1 and 2 share a model)
+     stg1_2 = msecnn.MseCnnStg1(device=DEV, QP=QP).to(DEV)
+     stg3 = msecnn.MseCnnStgX(device=DEV, QP=QP).to(DEV)
+     stg4 = msecnn.MseCnnStgX(device=DEV, QP=QP).to(DEV)
+     stg5 = msecnn.MseCnnStgX(device=DEV, QP=QP).to(DEV)
+     stg6 = msecnn.MseCnnStgX(device=DEV, QP=QP).to(DEV)
+     model = (stg1_2, stg3, stg4, stg5, stg6)
+
+     # Load model coefficients
+     model = tmu.load_model_parameters_eval(model, PATH_TO_COEFFS, DEV)
+
+     return model
+
+ def int2label(split):
+     """!
+     @brief Obtains the string that corresponds to an integer value of the split
+
+     @param [in] split: Integer number representing the split that the model chose
+     @param [out] str_split: Name of the corresponding split
+     """
+     if split == 0:
+         return "Non-Split"
+     elif split == 1:
+         return "Quad-Tree"
+     elif split == 2:
+         return "Horizontal Binary Tree"
+     elif split == 3:
+         return "Vertical Binary Tree"
+     elif split == 4:
+         return "Horizontal Ternary Tree"
+     elif split == 5:
+         return "Vertical Ternary Tree"
+     else:
+         return "Something wrong happened!"
+
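The integer-to-name mapping that `int2label` implements can equivalently be written as a lookup table; a minimal sketch (illustrative only, not part of demo.py, and the name `int2label_alt` is hypothetical):

```python
# Hypothetical sketch: the same integer-to-split-name mapping as int2label(),
# written as a lookup table instead of an if/elif chain.
SPLIT_NAMES = {
    0: "Non-Split",
    1: "Quad-Tree",
    2: "Horizontal Binary Tree",
    3: "Vertical Binary Tree",
    4: "Horizontal Ternary Tree",
    5: "Vertical Ternary Tree",
}

def int2label_alt(split):
    # Fall back to the same error string int2label() returns
    return SPLIT_NAMES.get(split, "Something wrong happened!")

print(int2label_alt(1))  # Quad-Tree
```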
+ def draw_partition(img, split, cu_pos, cu_size):
+     """!
+     @brief Draws the partition in the image based on the split output by the model
+
+     @param [in] img: User's input image
+     @param [in] split: Integer number representing the split that the model chose
+     @param [in] cu_pos: CU position
+     @param [in] cu_size: CU size
+     @param [out] img: Image with the partition drawn onto it
+     """
+     # Parameters to draw the lines
+     ver_line_length = cu_size[0]
+     hor_line_length = cu_size[1]
+
+     if split == 1:  # Quad-tree: one line along each axis
+         line1_start = (cu_pos[0], cu_pos[1] + hor_line_length // 2)
+         line1_end = (cu_pos[0] + ver_line_length, cu_pos[1] + hor_line_length // 2)
+         line2_start = (cu_pos[0] + ver_line_length // 2, cu_pos[1])
+         line2_end = (cu_pos[0] + ver_line_length // 2, cu_pos[1] + hor_line_length)
+         img = cv.line(img, line1_start, line1_end, COLOR, LINE_THICKNESS)
+         img = cv.line(img, line2_start, line2_end, COLOR, LINE_THICKNESS)
+     elif split == 2:  # Horizontal binary tree
+         line1_start = (cu_pos[0] + ver_line_length // 2, cu_pos[1])
+         line1_end = (cu_pos[0] + ver_line_length // 2, cu_pos[1] + hor_line_length)
+         # assert line1_start[0]-line1_end[0] == 0 or line1_start[1]-line1_end[1] == 0  # Lines must be either horizontal or vertical
+         img = cv.line(img, line1_start, line1_end, COLOR, LINE_THICKNESS)
+     elif split == 3:  # Vertical binary tree
+         line1_start = (cu_pos[0], cu_pos[1] + hor_line_length // 2)
+         line1_end = (cu_pos[0] + ver_line_length, cu_pos[1] + hor_line_length // 2)
+         img = cv.line(img, line1_start, line1_end, COLOR, LINE_THICKNESS)
+     elif split == 4:  # Horizontal ternary tree: two lines at thirds
+         line1_start = (cu_pos[0] + ver_line_length // 3, cu_pos[1])
+         line1_end = (cu_pos[0] + ver_line_length // 3, cu_pos[1] + hor_line_length)
+         line2_start = (cu_pos[0] + (ver_line_length * 2) // 3, cu_pos[1])
+         line2_end = (cu_pos[0] + (ver_line_length * 2) // 3, cu_pos[1] + hor_line_length)
+         img = cv.line(img, line1_start, line1_end, COLOR, LINE_THICKNESS)
+         img = cv.line(img, line2_start, line2_end, COLOR, LINE_THICKNESS)
+     elif split == 5:  # Vertical ternary tree: two lines at thirds
+         line1_start = (cu_pos[0], cu_pos[1] + hor_line_length // 3)
+         line1_end = (cu_pos[0] + ver_line_length, cu_pos[1] + hor_line_length // 3)
+         line2_start = (cu_pos[0], cu_pos[1] + (hor_line_length * 2) // 3)
+         line2_end = (cu_pos[0] + ver_line_length, cu_pos[1] + (hor_line_length * 2) // 3)
+         img = cv.line(img, line1_start, line1_end, COLOR, LINE_THICKNESS)
+         img = cv.line(img, line2_start, line2_end, COLOR, LINE_THICKNESS)
+     else:
+         raise Exception("Invalid split mode: " + str(split))
+
+     return img
+
+
+ def split_fm(cu, cu_pos, split):
+     """!
+     @brief Splits feature maps in a specific way
+
+     @param [in] cu: Input to the model
+     @param [in] cu_pos: Coordinates of the CU
+     @param [in] split: Way to split the CU
+     @param [out] cu_out: New feature maps
+     @param [out] cu_pos: Positions of the new CUs
+     """
+     if split == 0:  # Non-split
+         cu_out = cu
+         cu_pos = [cu_pos]
+
+     elif split == 1:  # Quad-tree
+         # Split the CU in four and add the pieces to the list
+         cu_1 = torch.split(cu, cu.shape[-2] // 2, -2)
+         cu_2 = torch.split(cu_1[1], cu_1[1].shape[-1] // 2, -1)
+         cu_1 = torch.split(cu_1[0], cu_1[0].shape[-1] // 2, -1)
+         cu_out = cu_1 + cu_2
+         cu_pos = [[cu_pos[0], cu_pos[1]], [cu_pos[0], cu_pos[1] + cu.shape[-1] // 2],
+                   [cu_pos[0] + cu.shape[-2] // 2, cu_pos[1]], [cu_pos[0] + cu.shape[-2] // 2, cu_pos[1] + cu.shape[-1] // 2]]
+
+     elif split == 2:  # Horizontal binary tree
+         cu_out = torch.split(cu, cu.shape[-2] // 2, -2)
+         cu_pos = [[cu_pos[0], cu_pos[1]], [cu_pos[0] + cu.shape[-2] // 2, cu_pos[1]]]
+
+     elif split == 3:  # Vertical binary tree
+         cu_out = torch.split(cu, cu.shape[-1] // 2, -1)
+         cu_pos = [[cu_pos[0], cu_pos[1]], [cu_pos[0], cu_pos[1] + cu.shape[-1] // 2]]
+
+     elif split == 4:  # Horizontal ternary tree
+         cu_out = torch.split(cu, cu.shape[-2] // 3, -2)
+         cu_pos = [[cu_pos[0], cu_pos[1]], [cu_pos[0] + cu.shape[-2] // 3, cu_pos[1]], [cu_pos[0] + (2 * cu.shape[-2]) // 3, cu_pos[1]]]
+
+     elif split == 5:  # Vertical ternary tree
+         cu_out = torch.split(cu, cu.shape[-1] // 3, -1)
+         # All three positions vary along the width axis
+         cu_pos = [[cu_pos[0], cu_pos[1]], [cu_pos[0], cu_pos[1] + cu.shape[-1] // 3], [cu_pos[0], cu_pos[1] + (2 * cu.shape[-1]) // 3]]
+
+     else:
+         raise Exception("This can't happen! Wrong split mode number: " + str(split))
+
+     if type(cu_out) is tuple:
+         if len(cu_out) != 1:
+             cu_out = torch.cat(cu_out)
+         else:
+             cu_out = cu_out[0]
+
+     return cu_out, cu_pos
+
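As a sanity check, the quad-tree branch of `split_fm` can be exercised on a tiny tensor; a minimal sketch (illustrative only, not part of demo.py):

```python
import torch

# Quad-tree split of a (1, C, H, W) feature map into four equal quadrants,
# mirroring what split_fm() does for split == 1.
cu = torch.arange(64, dtype=torch.float32).reshape(1, 1, 8, 8)

top, bottom = torch.split(cu, cu.shape[-2] // 2, dim=-2)  # split along height
tl, tr = torch.split(top, top.shape[-1] // 2, dim=-1)     # split along width
bl, br = torch.split(bottom, bottom.shape[-1] // 2, dim=-1)

quadrants = torch.cat((tl, tr, bl, br))  # stack on the batch dimension
print(quadrants.shape)  # torch.Size([4, 1, 4, 4])
```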
+ def partition_img(img, img_yuv):
+     """!
+     @brief Partitions a full 128x128 CTU and draws the partition on the original image
+
+     TODO: Instead of obtaining only the best split, apply thresholding and keep splitting until the right type of split is found
+
+     @param [in] img: Image in BGR
+     @param [in] img_yuv: Image in YUV
+     @param [out] img: Image with the partitions drawn on it
+     """
+     global model
+     # Stages 1 and 2: process the four 64x64 CUs of the CTU
+     pos_1 = torch.tensor([[0, 0]])
+     pos_2 = torch.tensor([[0, 64]])
+     pos_3 = torch.tensor([[64, 0]])
+     pos_4 = torch.tensor([[64, 64]])
+     split_1, CUs_1, ap_1 = model[0](img_yuv, FIRST_CU_SIZE, pos_1)
+     split_2, CUs_2, ap_2 = model[0](img_yuv, FIRST_CU_SIZE, pos_2)
+     split_3, CUs_3, ap_3 = model[0](img_yuv, FIRST_CU_SIZE, pos_3)
+     split_4, CUs_4, ap_4 = model[0](img_yuv, FIRST_CU_SIZE, pos_4)
+     all_cus_stg1 = [(split_1, CUs_1, ap_1, (0, 0)), (split_2, CUs_2, ap_2, (0, 64)),
+                     (split_3, CUs_3, ap_3, (64, 0)), (split_4, CUs_4, ap_4, (64, 64))]
+     img = draw_partition(img, 1, (0, 0), (128, 128))
+
+     # Stage 2: splitting
+     for cus_stg1 in all_cus_stg1:
+         split_stg1, cu_stg1, ap_stg1, pos_stg1 = cus_stg1
+         split_stg1 = tmu.obtain_mode(split_stg1)
+         if split_stg1 == 0:
+             continue
+         # Compute new CUs
+         try:
+             cu_out_2, cu_pos_2 = split_fm(cu_stg1, pos_stg1, split_stg1)
+         except RuntimeError:
+             # Weird partition happened; skip
+             continue
+         # Draw the partition on the original image
+         img = draw_partition(img, split_stg1, pos_stg1, (cu_stg1.shape[-2], cu_stg1.shape[-1]))
+
+         all_cus_stg2 = [(cu_out_2[idx, :, :, :].unsqueeze(0), ap_stg1, cu_pos_2[idx]) for idx in range(cu_out_2.shape[0])]
+
+         # Stage 3
+         for cus_stg2 in all_cus_stg2:
+             cu_stg2, ap_stg2, pos_stg2 = cus_stg2
+             pred_stg3, cu_stg3, ap_stg3 = model[1](cu_stg2, ap_stg2)
+             pred_stg3 = tmu.obtain_mode(pred_stg3)
+             if pred_stg3 == 0:
+                 continue
+             # Compute new CUs
+             try:
+                 cu_out_3, cu_pos_3 = split_fm(cu_stg3, pos_stg2, pred_stg3)
+             except RuntimeError:
+                 # Weird partition happened; skip
+                 continue
+             # Draw the partition on the original image
+             img = draw_partition(img, pred_stg3, pos_stg2, (cu_stg3.shape[-2], cu_stg3.shape[-1]))
+
+             all_cus_stg3 = [(cu_out_3[idx, :, :, :].unsqueeze(0), ap_stg3, cu_pos_3[idx]) for idx in range(cu_out_3.shape[0])]
+
+             # Stage 4
+             for cus_stg3 in all_cus_stg3:
+                 cu_stg3, ap_stg3, pos_stg3 = cus_stg3
+                 pred_stg4, cu_stg4, ap_stg4 = model[2](cu_stg3, ap_stg3)
+                 pred_stg4 = tmu.obtain_mode(pred_stg4)
+                 if pred_stg4 == 0:
+                     continue
+                 # Compute new CUs
+                 try:
+                     cu_out_4, cu_pos_4 = split_fm(cu_stg4, pos_stg3, pred_stg4)
+                 except RuntimeError:
+                     # Weird partition happened; skip
+                     continue
+                 # Draw the partition on the original image
+                 img = draw_partition(img, pred_stg4, pos_stg3, (cu_stg4.shape[-2], cu_stg4.shape[-1]))
+                 all_cus_stg4 = [(cu_out_4[idx, :, :, :].unsqueeze(0), ap_stg4, cu_pos_4[idx]) for idx in range(cu_out_4.shape[0])]
+
+                 # Stage 5
+                 for cus_stg4 in all_cus_stg4:
+                     cu_stg4, ap_stg4, pos_stg4 = cus_stg4
+                     pred_stg5, cu_stg5, ap_stg5 = model[3](cu_stg4, ap_stg4)
+                     pred_stg5 = tmu.obtain_mode(pred_stg5)
+                     if pred_stg5 == 0:
+                         continue
+                     # Compute new CUs
+                     try:
+                         cu_out_5, cu_pos_5 = split_fm(cu_stg5, pos_stg4, pred_stg5)
+                     except RuntimeError:
+                         # Weird partition happened; skip
+                         continue
+                     # Draw the partition on the original image
+                     img = draw_partition(img, pred_stg5, pos_stg4, (cu_stg5.shape[-2], cu_stg5.shape[-1]))
+
+                     all_cus_stg5 = [(cu_out_5[idx, :, :, :].unsqueeze(0), ap_stg5, cu_pos_5[idx]) for idx in range(cu_out_5.shape[0])]
+
+                     # Stage 6
+                     for cus_stg5 in all_cus_stg5:
+                         cu_stg5, ap_stg5, pos_stg5 = cus_stg5
+                         pred_stg6, cu_stg6, ap_stg6 = model[4](cu_stg5, ap_stg5)
+                         pred_stg6 = tmu.obtain_mode(pred_stg6)
+                         if pred_stg6 == 0:
+                             continue
+                         # Draw the partition on the original image
+                         img = draw_partition(img, pred_stg6, pos_stg5, (cu_stg6.shape[-2], cu_stg6.shape[-1]))
+
+     return img
+
+ def pipeline(img, text):
+     """!
+     @brief Pipeline implementing the functionalities that demonstrate the potential of the MSE-CNN
+
+     @param [in] img: Image in RGB
+     @param [in] text: Coordinates of the CTU within the image
+     @param [out] mod_img: Modified image, in RGB, with the partition drawn on it
+     @param [out] best_split: Best split (BTV, BTH, TTV, TTH, Non-split, QT)
+     """
+     global model
+
+     # In case nothing is submitted, return a default image and text
+     if img is None or text is None or text == DEFAULT_TEXT_FOR_COORDS:
+         img_error = Image.open(LOAD_IMAGE_ERROR)
+         img_error = np.array(img_error)
+         return img_error, "Load the image first and also make sure you specify the position of the CTU!"
+
+     # Obtain coordinates of the CTU
+     coords = re.findall(r"\d+", text)
+     coords = list(map(int, coords))
+
+     # Crop the image to the CTU size and make the dimensions even
+     img = img[coords[0]:coords[0] + 128, coords[1]:coords[1] + 128, :]
+     if img.shape[0] % 2 != 0:
+         img = img[:img.shape[0] - 1, :, :]
+     if img.shape[1] % 2 != 0:
+         img = img[:, :img.shape[1] - 1, :]
+
+     # Convert to YUV
+     img_yuv = cv.cvtColor(img, cv.COLOR_RGB2YUV_I420)
+     # Convert to a PyTorch tensor
+     img_yuv = torch.from_numpy(img_yuv)
+     # Obtain the luma channel
+     _, ctu_y, _, _ = cd.get_cu_v2(img_yuv, CTU_SIZE, (0, 0), CTU_SIZE)
+     # Reshape to the (batch, channel, height, width) format the model expects
+     ctu_y = torch.reshape(ctu_y, (1, 1, 128, 128)).to(DEV).float()
+
+     # Load model
+     model = setup_model()
+
+     # Partition image
+     img = partition_img(img, ctu_y)
+
+     return img, "Partitioned Image"
+
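The coordinate parsing and cropping at the top of `pipeline` can be sketched in isolation (illustrative only; the text and image sizes below are made up):

```python
import re
import numpy as np

# The coordinates come as free text, "height (y) first, then width (x)",
# and a 128x128 CTU is cropped from the image at that position.
text = "300, 800"
y, x = map(int, re.findall(r"\d+", text))
img = np.zeros((1080, 1920, 3), dtype=np.uint8)  # stand-in for the user image
ctu = img[y:y + 128, x:x + 128, :]               # crop the CTU area
print(ctu.shape)  # (128, 128, 3)
```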
+ def main():
+     with open("description.md", encoding="utf-8") as f:
+         description = f.read()
+
+     in_text_box = gr.Textbox(value=DEFAULT_TEXT_FOR_COORDS, label="Coordinates of CTU", info="You have to provide two numbers indicating the position of the CTU in the image")
+     in_image = gr.Image(label="Input image", info="Either use the example image or an image of your choosing")
+     out_text_box = gr.Textbox(label="Completion Message")
+     out_image = gr.Image(label="Partitioned CTU", info="Result of partitioning using MSE-CNN")
+
+     demo = gr.Interface(fn=pipeline, inputs=[in_image, in_text_box],
+                         examples=[[EXAMPLE_IMGS[0], "300, 800"], [EXAMPLE_IMGS[0], "100, 200"], [EXAMPLE_IMGS[1], "600, 925"], [EXAMPLE_IMGS[1], "450, 1600"]],
+                         thumbnail="msecnn_model.png",
+                         outputs=[out_image, out_text_box], description=description, debug=True,  # inbrowser=True,
+                         title="MSE-CNN Demo", image="msecnn_model.png")
+
+     demo.launch()
+
+
+ # ==============================================================
+ # Main
+ # ==============================================================
+
+ if __name__ == "__main__":
+     main()
description.md ADDED
@@ -0,0 +1,14 @@
+ With this demo you will be able to better understand how the MSE-CNN works and what it aims to do! :D
+
+ <center>
+ <img src="file/msecnn_model.png" width=500 />
+ </center>
+
+ ## Tutorial
+ To use this demo, follow these steps ;)
+
+ 1. Load an image from your PC. Since the model has to be fed 128x128 CTUs, if your image is larger than that, only a 128x128 area of it will be used (by default, the area at the top left). You can tell the app the coordinates of the area you want to partition. The coordinates are relative to the image's upper-left corner, so position 0,0 refers to that corner. Note that you must specify the height position (y axis) first and then the width position (x axis).
+ 2. Click "Submit" to pass the image through the model.
+ 3. After the previous step, an image will be displayed with the best way, according to the model, to partition that section of the image. The possible ways to partition a block (coding unit, CU) in VVC are Quaternary Tree (QT), Binary Tree Horizontal or Vertical (BTH or BTV), Ternary Tree Horizontal or Vertical (TTH or TTV), and Non-split.
+
+ **Note**: This demo implementation has some limitations, such as the fact that the model occasionally makes illogical predictions. For instance, splitting a 16x32 CU with VTT is incorrect. This happens because of the model's inherent constraints and because only the best split is being chosen. One way to reduce this behaviour is to evaluate not only the optimal split but also alternative splits. This can be accomplished by, for instance, applying a multi-threshold method to the model's predictions and keeping the splits that are most likely to occur. Additionally, when an unreasonable split is predicted, the code immediately halts the partitioning of that particular block.
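The multi-threshold idea mentioned in the note could be sketched as follows (a hypothetical illustration with made-up logits, not the demo's implementation):

```python
import torch

# Instead of keeping only the arg-max split, keep every split whose
# probability is within a chosen fraction of the best one.
logits = torch.tensor([0.1, 2.0, 1.8, 0.3, -1.0, 0.2])  # one score per split mode
probs = torch.softmax(logits, dim=0)
threshold = 0.5  # keep splits at least half as likely as the best one
candidates = torch.nonzero(probs >= probs.max() * threshold).flatten().tolist()
print(candidates)  # [1, 2] -> Quad-Tree and Horizontal Binary Tree
```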
example_img_1.jpeg ADDED
example_img_2.jpeg ADDED
load_image_error.png ADDED
msecnn_model.png ADDED