wandb: WARNING Serializing object of type dict that is 295000 bytes
wandb: WARNING Serializing object of type dict that is 147552 bytes
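wandb emits the two "Serializing object" warnings above when a large dict (here roughly 295 kB and 147 kB, most likely the full training configuration or a bulky metrics payload) is pushed to the run. They are harmless, but oversized payloads slow logging; below is a minimal sketch of one way to keep them small, where `config` is a hypothetical stand-in for whatever dict triggered the warning and the project name is an assumption.

import wandb

# Hypothetical large dict standing in for the ~295 kB object wandb warns about.
config = {"lr": 5e-5, "epochs": 3, "answer_vocab": list(range(5630))}

# Keep only small scalar fields so wandb does not serialize huge nested values.
slim = {k: v for k, v in config.items() if isinstance(v, (int, float, str, bool))}

run = wandb.init(project="vilt-vqa", config=slim)  # project name is an assumption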
/tmp/ipykernel_34/2312826469.py:2: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
input_ids = [torch.tensor(item['input_ids'], dtype=torch.long) for item in batch]
/tmp/ipykernel_34/2312826469.py:3: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
pixel_values = [torch.tensor(item['pixel_values'], dtype=torch.float) for item in batch]
/tmp/ipykernel_34/2312826469.py:4: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
attention_mask = [torch.tensor(item['attention_mask'], dtype=torch.long) for item in batch]
/tmp/ipykernel_34/2312826469.py:5: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
token_type_ids = [torch.tensor(item['token_type_ids'], dtype=torch.long) for item in batch]
/tmp/ipykernel_34/2312826469.py:6: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
labels = [torch.tensor(item['labels'], dtype=torch.long) for item in batch]
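The five UserWarnings above all come from the notebook's collate function re-wrapping existing tensors with torch.tensor(). Below is a minimal sketch of the fix the warning itself recommends, assuming each dataset item already holds padded, equal-length tensors under the key names shown in the log (if the items held plain Python lists instead, torch.tensor() would be correct and the warning would not fire).

import torch

def collate_fn(batch):
    # Copy existing tensors with clone().detach() instead of torch.tensor(),
    # exactly as the UserWarning recommends, then stack into batch tensors.
    return {
        "input_ids": torch.stack([item["input_ids"].clone().detach().long() for item in batch]),
        "pixel_values": torch.stack([item["pixel_values"].clone().detach().float() for item in batch]),
        "attention_mask": torch.stack([item["attention_mask"].clone().detach().long() for item in batch]),
        "token_type_ids": torch.stack([item["token_type_ids"].clone().detach().long() for item in batch]),
        "labels": torch.stack([item["labels"].clone().detach().long() for item in batch]),
    }

torch.as_tensor(item[...]) is an equally valid replacement when items may hold either lists or tensors, since it avoids the copy (and the warning) for tensor inputs.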
/opt/conda/lib/python3.10/site-packages/accelerate/accelerator.py:436: FutureWarning: Passing the following arguments to `Accelerator` is deprecated and will be removed in version 1.0 of Accelerate: dict_keys(['dispatch_batches', 'split_batches', 'even_batches', 'use_seedable_sampler']). Please pass an `accelerate.DataLoaderConfiguration` instead:
dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)
warnings.warn(
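This FutureWarning spells out its own fix; below is a minimal sketch of the replacement call, mirroring the values the warning prints. If the Accelerator is constructed internally by transformers' Trainer, as the accelerate traceback suggests, upgrading transformers is the more likely remedy than changing user code.

from accelerate import Accelerator, DataLoaderConfiguration

# Bundle the dataloader options into a DataLoaderConfiguration instead of
# passing them to Accelerator directly, as the FutureWarning requests.
dataloader_config = DataLoaderConfiguration(
    dispatch_batches=None,
    split_batches=False,
    even_batches=True,
    use_seedable_sampler=True,
)
accelerator = Accelerator(dataloader_config=dataloader_config)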
wandb: WARNING Serializing object of type dict that is 295000 bytes
wandb: WARNING Serializing object of type dict that is 147552 bytes
/tmp/ipykernel_34/2975308579.py:2: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
input_ids = [torch.tensor(item['input_ids'], dtype=torch.long) for item in batch]
/tmp/ipykernel_34/2975308579.py:3: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
pixel_values = [torch.tensor(item['pixel_values'], dtype=torch.float) for item in batch]
/tmp/ipykernel_34/2975308579.py:4: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
attention_mask = [torch.tensor(item['attention_mask'], dtype=torch.long) for item in batch]
/tmp/ipykernel_34/2975308579.py:5: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
token_type_ids = [torch.tensor(item['token_type_ids'], dtype=torch.long) for item in batch]
/tmp/ipykernel_34/2975308579.py:6: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
labels = [torch.tensor(item['labels'], dtype=torch.long) for item in batch]
Some weights of ViltForQuestionAnswering were not initialized from the model checkpoint at dandelin/vilt-b32-finetuned-vqa and are newly initialized because the shapes did not match:
- classifier.3.weight: found shape torch.Size([3129, 1536]) in the checkpoint and torch.Size([5630, 1536]) in the model instantiated
- classifier.3.bias: found shape torch.Size([3129]) in the checkpoint and torch.Size([5630]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
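The shape-mismatch message above is what transformers prints when a checkpoint head is replaced: the VQA checkpoint ships a 3,129-way classifier while this run asks for 5,630 answer classes. A load like the following sketch would produce it; the num_labels value is read off the reported shapes, and ignore_mismatched_sizes=True is required because without it the size mismatch raises an error instead of re-initializing the head.

from transformers import ViltForQuestionAnswering

# Swap the checkpoint's 3129-way VQA head for a freshly initialized 5630-way
# head; only the classifier weights are re-initialized, the backbone loads as-is.
model = ViltForQuestionAnswering.from_pretrained(
    "dandelin/vilt-b32-finetuned-vqa",
    num_labels=5630,
    ignore_mismatched_sizes=True,
)

The closing note in the log follows from this: the new classifier.3 weights are random, so the model must be fine-tuned (which is what this run is doing) before its predictions mean anything.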
/opt/conda/lib/python3.10/site-packages/accelerate/accelerator.py:436: FutureWarning: Passing the following arguments to `Accelerator` is deprecated and will be removed in version 1.0 of Accelerate: dict_keys(['dispatch_batches', 'split_batches', 'even_batches', 'use_seedable_sampler']). Please pass an `accelerate.DataLoaderConfiguration` instead:
dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)
warnings.warn(
wandb: WARNING Serializing object of type dict that is 295000 bytes
wandb: WARNING Serializing object of type dict that is 147552 bytes
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
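This last warning comes from torch gathering per-replica losses under nn.DataParallel: each GPU returns a 0-d scalar, gather() unsqueezes them into a 1-D vector, and training code should reduce that vector before calling backward(). Below is a self-contained sketch of the pattern; ToyModel is a hypothetical stand-in for the VQA model, and the vector behaviour only appears with more than one visible GPU.

import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.ones(4))

    def forward(self, x):
        # Returning a 0-d scalar per replica is what triggers the gather warning.
        return ((x * self.w) ** 2).mean()

base = ToyModel().cuda() if torch.cuda.is_available() else ToyModel()
model = nn.DataParallel(base)

loss = model(torch.randn(8, 4))  # 1-D vector on multi-GPU, 0-d scalar otherwise
loss = loss.mean() if loss.dim() > 0 else loss
loss.backward()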