
Future vehicular networks must ensure ultra-reliable low-latency communication (URLLC) for the timely delivery of safety-critical information. Previously proposed resource allocation schemes for URLLC mostly rely on centralized optimization-based algorithms and cannot guarantee that the reliability and latency requirements of vehicle-to-vehicle (V2V) communications are met. This paper investigates joint power and blocklength allocation to minimize the worst-case decoding error probability in the finite blocklength (FBL) regime for a URLLC-based V2V communication network. We formulate the problem as a non-convex mixed-integer nonlinear program (MINLP). We first develop a centralized optimization-theory-based algorithm by proving that the decoding error probability is jointly convex in the blocklength and transmit power within the region of interest. Next, we propose a two-layered multi-agent deep reinforcement learning framework that is trained centrally and executed in a distributed manner. In the first layer, multiple deep Q-networks (DQNs) are established at the central trainer to train the local DQNs for blocklength optimization. The second layer employs an actor-critic network and uses the deep deterministic policy gradient (DDPG) algorithm to train the local actor network of each V2V link. Simulation results demonstrate that the proposed distributed scheme achieves close-to-optimal solutions with much lower computational complexity than the centralized optimization-based solution.
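For reference, formulations of this kind typically build on the normal approximation of the decoding error probability in the FBL regime; the paper's exact expression may differ, and the symbols $m$, $D$, and $\gamma$ below denote an assumed blocklength, payload size in bits, and SINR, respectively:

\[
\varepsilon \approx Q\!\left(\frac{C(\gamma) - D/m}{\sqrt{V(\gamma)/m}}\right), \qquad
C(\gamma) = \log_2(1+\gamma), \qquad
V(\gamma) = \left(1 - (1+\gamma)^{-2}\right)(\log_2 e)^2,
\]

where $Q(\cdot)$ is the Gaussian Q-function. The sketch below, which is not the paper's implementation, illustrates how a two-layer per-link agent of the kind described could be wired: a DQN selects a discrete blocklength and a DDPG actor outputs a continuous transmit power. All dimensions, layer sizes, and variable names are illustrative assumptions.

```python
# Minimal sketch (assumed shapes/hyperparameters, not the paper's exact design)
# of a two-layer per-link agent: a DQN chooses a discrete blocklength index and
# a DDPG actor outputs a continuous transmit power for the V2V link.
import torch
import torch.nn as nn

class BlocklengthDQN(nn.Module):
    """Layer 1: local DQN mapping the link's observed state to Q-values
    over a discrete set of candidate blocklengths."""
    def __init__(self, state_dim: int, num_blocklengths: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, num_blocklengths),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

class PowerActor(nn.Module):
    """Layer 2: local DDPG actor mapping (state, chosen blocklength) to a
    transmit power in [0, p_max]."""
    def __init__(self, state_dim: int, p_max: float):
        super().__init__()
        self.p_max = p_max
        self.net = nn.Sequential(
            nn.Linear(state_dim + 1, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Sigmoid(),  # output scaled to [0, p_max]
        )

    def forward(self, state: torch.Tensor, blocklength: torch.Tensor) -> torch.Tensor:
        x = torch.cat([state, blocklength], dim=-1)
        return self.p_max * self.net(x)

# Distributed execution for one V2V link: pick a blocklength greedily from the
# trained DQN, then feed it to the actor to obtain the transmit power.
state = torch.randn(1, 16)                       # assumed 16-dim local observation
dqn, actor = BlocklengthDQN(16, 10), PowerActor(16, p_max=1.0)
m_idx = dqn(state).argmax(dim=-1, keepdim=True)  # index into a candidate blocklength set
power = actor(state, m_idx.float())
```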