the Msg3 UL grant requires the TTI in which the RAR has been received
to calcualte the correct timing. There was a race between PHY and Stack
thread.
This patch circumvents the issue by removing a PHY state member that only holds
the RAR Rx timing. In the new interface the RA proc passes the Rx TTI
to the PHY again when setting the UL grant so the PHY can calculate the
correct timing without any state.
fixes stack use after free detected by ASAN
2021-08-31T17:21:44.885938 [MAC-NR ] [D] [ 0] Building new MAC PDU (9 B)
==10908==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffc481b5340 at pc 0x563c0486d489 bp 0x7ffc481b4470 sp 0x7ffc481b4460
READ of size 1 at 0x7ffc481b5340 thread T0
#0 0x563c0486d488 in srsran::mac_sch_subpdu_nr::to_string(fmt::v7::basic_memory_buffer<char, 500ul, std::allocator<char> >&) (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x139488)
#1 0x563c0486db87 in srsran::mac_sch_pdu_nr::to_string(fmt::v7::basic_memory_buffer<char, 500ul, std::allocator<char> >&) (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x139b87)
#2 0x563c0481c127 in srsue::mux_nr::get_pdu(unsigned int) (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0xe8127)
#3 0x563c0484e62b in srsue::ul_harq_entity_nr::ul_harq_process_nr::new_grant_ul(srsue::mac_interface_phy_nr::mac_nr_grant_ul_t const&, bool const&, srsue::mac_interface_phy_nr::tb_action_ul_t*) (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x11a62b)
#4 0x563c04850de4 in srsue::ul_harq_entity_nr::new_grant_ul(srsue::mac_interface_phy_nr::mac_nr_grant_ul_t const&, srsue::mac_interface_phy_nr::tb_action_ul_t*) (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x11cde4)
#5 0x563c047bb004 in srsue::mac_nr::new_grant_ul(unsigned int, srsue::mac_interface_phy_nr::mac_nr_grant_ul_t const&, srsue::mac_interface_phy_nr::tb_action_ul_t*) (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x87004)
#6 0x563c04760cdc in msg3_test() (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x2ccdc)
#7 0x563c0475f762 in main (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x2b762)
#8 0x7fae1cf400b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
#9 0x563c047601bd in _start (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x2c1bd)
Address 0x7ffc481b5340 is located in stack of thread T0 at offset 320 in frame
#0 0x563c0486d78f in srsran::mac_sch_pdu_nr::to_string(fmt::v7::basic_memory_buffer<char, 500ul, std::allocator<char> >&) (/home/ubuntu/workspace/srslte_ubuntu_20.04_pull_request/srslte/build/srsue/src/stack/mac_nr/test/mac_nr_test+0x13978f)
setting the new PRACH params (writing the the local var) needs to protected as well
because it is called from the RRC context and the PHY worker will call configure_prach_params()
if it sees changes to it.
NAS states and substates maybe be requested from other threads so
they need to be protected.
Note that the caller still needs to hold it's own mutex if different
actions are required based on the state.
the code hasn't been maintained for a while an likely needs to be
adapted for a real-world scenarios.
in order to avoid having to maintain two MAC/PHY interfaces we
remove the code from now.
some commands were executed from the calling thread which may lead
to concurrent access to members. Detected by TSAN. The patch
moves all remaining calls (the majority was alread moved) to the
Stack task queue.
fixed through the right usage of mutexes in both TTCN PHY and syssim.
nested mutex locking is solved by calling SS from the PHY after
releaseing the PHY lock again.
* Protect PHY SR signal management in a class
* Protect intra_freq_meas vector
* Protect cell and srate shared variables in thread-safe classes
* srsue,srsenb: include TSAN options header
* Protect ue_rnti_t and rnti scheduling windows behind thread-safe classes
* Protect access to state variable in sync_state
* Protect access to metrics configuration
* Protect access to is_pending_sr
* Protect access to UE prach worker
* Protect UE mux
* Avoid unlocking mutex twice
* Fix data races in RF/ZMQ
* Fix data races in intra_measure and PHY
* Fix minor data races in MAC
* Make TSAN default behaviour to not halt on error
* Fix blocking in intra cell measurement
* Address comments
Co-authored-by: Andre Puschmann <andre@softwareradiosystems.com>
the EPS bearer manager was only informed when a single DRB
was removed but not when entering idle which requires to
remove all bearers.
This cause the service request to fail.
this is a rather large commit that is hard to split because
it touches quite a few components.
It's a preparation patch for adding NR split bearers in the next
step.
We realized that managing RLC and PDCP bearers for both NR and LTE
in the same entity doesn't work. This is because we use the LCID
as a key for all accesses. With NR dual connectivity however we
can have the same LCID active at the same time for both LTE and NR
carriers.
The patch solves that by creating a dedicated NR instance for RLC/PDCP
in the stack. But then the question arises for UL traffic on, e.g. LCID 4
what PDCP instance the GW should use for pushing SDUs. It doesnt' know
that. And in fact it doesn't need to. It just needs to know EPS
bearer IDs. So the next change was to remove the knowledge of what
LCIDs are from the GW. Make is agnostic and only work on EPS bearer IDs.
The handling and mapping between EPS bearer IDs and LCIDs for LTE
or NR (mainly PDCP for pushing data) is done in the Stack because
it has access to both.
The NAS also has a EPS bearer map but only knows about default and
dedicated bearers. It doesn't know on which logical channels they
are transmitted.
this patch mainly modernizes the bearer creation to use smart pointers.
that allows to simplify the error handling.
ue_stack is changed to match new interface. This commit compiles
but doesn't work.
the issue let to unwanted log warning at the end of the UE
execution when the PHY was still pushing DL PDUs while MAC
was already stopped.
This fixes#3003
we had it returning int but had a bug in using the return value properly,
i.e. handling when -1 was returned in RLC TM.
Thinking about it more, it doesn't make sense to have a negative return
value here anyway. Either the RLC can return a PDU or not. If it can't the
returned lenght is zero.
general-purpose method for lower layers to signal protocol
failures to higher layers, i.e. RRC.
In the current case, implement a direct release of the UE (enb) or
a reestablishment (UE).
races detected with TSAN, primarily the ue_sync object which isn't
thread-safe is accessed by all workers to set the CFO and by the
sync thread to receive samples (which read the CFO).
The patch introduces shadow variables that are updates from the
main thread before/after the sync is executed. The atomic shadow
variables can then be read from otherthreads without holding a
mutex, i.e. blocking the sync.
in the mac_test the tb_decoded() method was called twice for
the 2nd codeword, causig TSAN to complain about an unlocked mutex
being unlocked.
The patch resolves the potential issue only calling tb_decoded
for a grant/tb thas has a non-zero MCS.
The patch also adjusts the reset function to have a safe and "unsafe"
version to be called from inside the class, similar to other
classes where we do the same.
this is an attempt to fix#2850 by defering the transmission of
the connection setup complete until the PHY has applied
the dedicated config in the connection setup.
- the consumer does multi-staged waiting:
1. spins first across all queues in a RR fashion
2. each queue access is done with a try_lock.
3. if the try_lock fails, it increases the number of spins needed
2. if no queue had data, the consumer sleeps for 100 usec.
- no differentiation between queues, in terms of notification features